Introduction



Classification and classification systems have formed the foun- 
dation of retrieval systems since man first began to record 
knowledge. Good histories and descriptions of classification 
schemes are few and far between, and usually little appears in 
them to explain the whys and wherefores. But obviously man 
has recognized the need to organize in order to retrieve. 

Americans are not particularly classification-minded, as Mr.
Stevenson will point out later. One of the great anomalies of
American classification is the Library of Congress Classification,
which has little to say for itself philosophically except that
it works. One of my favorite quotes comes from Phyllis Richmond,
who once wrote of the Library of Congress Classification:



In a discussion of classification research, the Library of
Congress system does not fit any of the categories described.
It is a pragmatic, functional system that is widely
used with considerable consumer satisfaction. It is not logical,
it is not scientifically or probabilistically built; it has little
to do with language or linguistics other than to provide the
best classification of these subjects extant; in organization it
sprawls in all directions; it violates all the postulates, principles
and laws that are considered important in classification
making; in some areas relationships are shown in hierarchies,
but throughout most of the schedules nothing seems
to be next to anything for any particular reason; yet it grows
steadily without any serious sign of stress. Why does it
work?¹



Americans have been inclined to leave classification at just that:
as long as it works, that is all that counts.

Classification in the United States has developed quite
uniquely. The so-called traditional schemes to which we are
wed have been designed and used more as browsing tools or
shelf organizers and hence tend to classify only generally. This
has caused a "one place on the shelf, one place in the scheme"
philosophy. The Europeans and Asians have used classification
to organize concepts, rather specifically in indexes (classified
catalogs), in order to retrieve information, not an item. This
difference in approach has indeed influenced every aspect of
classification around the world.

Quite a few people have posed the question, especially now
that automated bibliographic control is becoming a reality: Why
bother? Let the computer do it. Classification is dead! There
are other ways to access information. I leave this as a potential
hypothesis, not yet researched or validated. Maybe because of
my cataloger's inbred loyalty to classification as a self-evident
truth, I assume classification is very much alive.

The purpose of this issue is to discuss classification today 
(primarily in the United States), with some insights into the directions
of the future. There is no attempt to be comprehensive,
thorough, exhaustive, etc. The authors have been asked merely 
to put some of their ideas and thoughts down. This is not a 
state-of-the-art, it is not a history, it is not a how-to-do-it man- 
ual for classifiers. One paper attempts to define classification 
and provides the theme of the issue. The historical paper in- 
tends to set the stage and indicate major trends. From a 
theoretical point of view both the traditional and the modern at- 
titudes and characteristics toward classification are sum- 
marized. With the theoretical framework provided, the Dewey 
Decimal Classification, Library of Congress Classification and 
the Universal Decimal Classification are examined. And finally, 
there are two papers on the future — automatic classification 
and research. Admittedly this is a rather loose framework but it 
has allowed the reader an opportunity to see where American 
classification stands. 

The working classifier may find the papers interesting and informative,
perhaps reinforcing. The student may find them an
introduction and summary on which to base further explora- 
tion. The researcher may not need really to dwell on them 
much at all. The papers are offered to the general librarian, not
the specialist, with the hope that they may stimulate interest
and improve awareness of the heart of the retrieval problem:
classification.

Ann F. Painter, PhD
Professor 

Graduate School of Library Science 
Drexel University 
Philadelphia, Pa. 



1. Phyllis A. Richmond, "Some Aspects of Basic Research in Classification,"
Library Resources & Technical Services 4 (Spring 1960): 139-147.



Harris Shupak



1 



The point of a definition is to provide a target for a concept,
allowing the specification of its purpose, and in this specificity
to label one or more variables of the universe. The task is formidable,
especially as so many layers of meaning are implicated
in the descriptors we use, with attendant expectations of
their worth. With concepts becoming long-standing practices,
this challenge is even greater. In this sense, I wonder if classification
can be defined at all! With this warning given, however, I
shall launch into my subject. Rather than give an initial definition
and then attempt to prove why it is more or less true than
other definitions, I shall illustrate various aspects of what I consider
to be the practices of classification. Thus, my method
should, with luck, back into the central problem of the paper.

A curious fact of our history on earth is the rise of stratified 
classes within human society, classes often based on the ex- 
clusive possession of differentiated skills deemed to be of sig- 
nificant value to these groups. Once these distinctions occur- 
red, man no longer remained coequal with other men but, to 
paraphrase the words of a noted analyst, "some men became 
more equal than others."* Stratified classes were early indications
of man's ability to perceive distinctions and order his universe
around them. Seen collectively, these stratifications are
nothing more than the universe of his existence. Taken sepa- 
rately, they are the basis for hierarchical rankings and subdivi- 
sion of that world. 

Another example of what I may be allowed to call man's inher- 
ent process of artificially ordering the world he perceives, and 
one that illustrates another facet of the discussion, is the world 
of kinship systems in non-Western societies. In addition to the
forces of stratification, languages of kinship became formalized,
precisely indicating the levels of relationship between
branches of a family and its individual members. If an analogy
can be made, it would be this: from a perceived difference
(based on artificial criteria) between societal members, terms of
address in kinship languages formalized these distinctions, giving
notational relationship between these individuals. If some
tribes locked these patterns in too rigidly, then at what cost
would personal initiative have to suffer in order to break the
deadlock of these expectations? Then again, with forces of diffusion
and dispersion so widespread through this aged world,
what changes were made to the set of these kinship orders?
Could societies change the basis of kinship expectations without
changing their stratified orders? This point, perhaps imaginary,
is made to demonstrate the complexity of ordering and
changing the universe of man's perceptions. Our classification,
as a philosophy and a practice, stands in the same proportion 
of difficulty as these anthropological phenomena. 

In this paper, I wish to illustrate classification as a process of
naming and ordering this universe, but not merely an activity
solely directed to some objective world of knowledge. The historical
process of stratification-classification has been one of
advancing knowledge as our understanding of natural and artificial
orders has increased, to give new relationships a rightful
and accurate place in the scheme of things. We have had to
compare these changes to hierarchical orders previously constructed.
In this way, changes in our classified orders came not
in scattered bits, or bytes, as it were, but as alternatives to the
hierarchy of established facts, and hence to knowledge itself.

A carefully stratified order moves continuous time into separate
epochs; thus, the extrapolation of time and circumstance was
given definite boundaries. Within each epoch, alterations could
be observed in terms of that specific time period, with each one
having its own level of development, contrasted with other
ages, ordered and classified according to these distinctions.
This was our heritage of intellectual classification, changing in
its sophistication as our accumulation of facts increased. The
order it created became the foundation for comprehending the
universe of knowledge.

Why do men have difficulty introducing radical change into
their classified orders? Why has the existence of intuitive leaps
been such an important device for accomplishing these
changes? After hypothesizing and experimenting, sifting the
studied relationships carefully, imagining through these concepts
the new possible orders for them, after all this, how often
do the gaps still exist? Man cannot consciously finish the job.
The intuitive leap, that process of comprehension so slightly
beyond the conscious world, accomplishes the extraordinary
feat of interpolating these facts into the order that was not
quite within reach of the thinker. With intuition, old classifications
are destroyed and new ones created. These new classifications
of phenomena, discovered in hard thinking and timely
serendipity, have their own language, classes, their distinguishing
characteristics from former classifications and their points
of duplication. In some cases, the terms gaining access to
these classifications are absolute: accurately part of the new
order itself, unrelated, in its essentials, to previous classes and
terms. In other cases, the terms of reference will be fuzzy, questionable,
almost belonging to one or another class and to several
different classifications. In gaining the best access to the
hierarchy, how can one be sure of the accuracy of terms? In an
absolute order there is no confusion. In separating a homogeneous
world, however, how can man's order duplicate
a natural order with the same degree of perfection? Therefore,
no order can be absolute; it is only a temporarily derived stage
for viewing the accumulation of facts to date. Yet, as a process
special with man, is it not fascinating to recollect that our progress
as a species was so fast, accurate, and unstoppable because
we had gained control of such a power as classification?

If I have not explicitly defined anything yet, you may see that
the difficulty rests with trying to pinpoint an activity so pervasive
in man's growth and history. One of the large conflicts in
thinking of our library classification, in contradistinction to the 
intellectual process I have described, is in deciding what clas- 
sification really represents. When we invoke a Dewey Decimal 
number, are we seeing a pattern, a piece of the classified order, 
falling into our comprehension? Are we handling a representa- 
tive from that order in the form of a document in which part of 
that classified order shall be revealed? Or is it merely a place
reserved for such a representative? The differences between 
these ideas can yield three definitions. 



Historically, it may be said that classification is a process and
an act of ordering and differentiating the universe, yet classification
also mirrors philosophy, has its own terminology and is
an organization of places to store things, be they ideas or
documents. The mixture of these elements has caused a reign
of confusion as to what purpose classification should serve.
The Baconian influence on library classification has been well
documented. Bacon's era extended library classification from
the "art" of making philosophical charts of the universe (then
at its apogee) to a conjunction, on a cosmic scale, of collected
bits of a developed process of mental classification. One respects
Bacon for his ability to do this so beautifully, comprehensively
and lastingly. Several hundred years later, through
interminable changes, influences, and practices, the Baconian
universe met Melvil Dewey, and there, in one of the more important
historical events of man's history of classification, we
find the first significant and lasting admixture of philosophical 
perception and the rather mundane practice of storing docu- 
ments in libraries. In this encounter, a question was created 
that has not been solved. What is classification? Is it the prac- 
tice of philosophical differentiation I have been describing or 
the art of accurately storing documents with a mind toward ef- 
fective retrieval of related pieces of information? 

Dewey lifted a sagging world of shelvers and card catalog makers,
and gave librarians a chance to participate in the comprehension
of rarity and beauty: the worlds of philosophy, the
mirrors of man's universe, determined through these specialized
perceptive and cognitive abilities. Henceforth, when
we hear about the classical debates of where to place certain
documents in the Decimal Classification, it is not merely that a
question of location is being argued. Indeed, one has the feeling
that the interlocutors were questioning the inherent order
of the universe itself. Why else would these practical storers of
information give so much heat to the argument? Even though it
was dealing with the documents of knowledge, the welding of
philosophy into the Decimal Classification made it a process
whereby the universe was divided into identifiable classes,
further subdivided by these perceived differences, and given an
appropriate notation for retrieving the documents of this order.
This may be a simplification, but I am illustrating classification,
not giving a manual for the Dewey Decimal Classification
(DDC). Later elaborations of this process found more enumerative
schemes, in which this universe was explicitly developed
for document retrieval and the collation of related materials.
The intent was to integrate materials as they were being stored.

An ironic fact about the DDC is that it has managed to persist
so long, undergoing numerous reorderings of its scheme and yet
remaining representative of our changing world, cohesive and
contemporary. It is this duality of purpose that has given it so
many problems. Perhaps it is being made to do too much?
Perhaps it is only feasible for a limited point in time?

In beautiful opposition to the philosophy and practice of DDC,
we have a modern science of classification which explores
another extreme. It is called subject analysis, faceted classification,
developing in its glory as a computerized operation.
Beginning with a universe of discrete facts, ideas or subjects, it
seeks initially to abolish formerly perceived classes, and substitute
for the old method of finitely breaking down the universe,
one which minutely orders these facts, bringing related aspects
of documents together into classes which represent the co-occurrence
of terms as analyzed in the documents themselves.
It is a process of building up the classification from these facts,
or facets, without seeking to create a complete universe. With
this method of classification, philosophy has been returned to
philosophers. Computerized classification offers an opportunity
to relate conceptually the documents of knowledge much more
precisely. The classifications are not stable, but change frequently
with the reordering of subjects in these documents. Indeed,
one would wonder whether this is really classification, as
it seems so antithetical to the progress of mental classification
with which I began this paper.

These classifications are interesting in principle, but significant
costs will have to be assumed to perfect their development as
tools of classification. It is generally assumed that faceting can
work for certain small classes of documents, but with general
library collections they would be useless. In some experiments,
classes created by computer algorithms had to be combined
with classes from a traditional classification in order to reduce
the amount of irrelevant materials retrieved. Even where computers
can create such classifications, other practices must be
appended, like in-depth indexing systems, independent notational
systems for storing documents and the great interpretive
involvement of librarians.

Are there any commonalities between these contrasted classifications
that would allow us to define classification? Yes, one!
In both cases classification must become an ordering scheme
for locating these documents. At all costs, whatever scheme we
invent or use, the classification must be able to perform efficiently
at this central task: locating the document for the user.
Modern classificationists are attempting to take this qualification
and to say, "If machines can do it faster, though not more
comprehensively, and on an average perform as well, then our
methods must be equally as valid. Not better, but at least as
valid." For whatever slight cost advantage is given by this perceived
equality, these classification systems will be developed
for special collections, perhaps one day for general library use.

It must be said again, therefore, that whatever the method, the
finite dispersion of the universe or the monolithic creation of a
world of discrete subjects, we still have to store and retrieve
these documents. I think, now, that my definition of classification
is apparent: it is the activity of storing documents for retrieval.
No order is complete without a basis for distinguishing
and differentiating the documents, but the locating and storing
function, the notational device, must be independent of that
scheme. I could conceive that, given the ability of successfully
storing and locating documents, any scheme in the future
might be adaptable to that purpose. So, in the end, I have offered
a rather unstartling definition of classification. Is it reflective
and worthy of my argument?

The problem is that I am forced to uphold it, but I do not fully
believe it. Classification was initially described as a mental process
of ordering the universe, and we have taken our library
practice and reduced it to a mere act of storing and locating
documents in a collection of materials. On one hand we can
speak of classification in the highest sense, that which hierarchically
orders the universe, allowing us to proceed from
concept experience to concept experience, revising our
categories, but building up our knowledge as a consistent attempted
representation of the universal order. On the other, we
merely speak of our library classification as a shuffling device,
devoid of that presence which exists when the two are combined,
as in the original encounter of the Dewey Decimal Classification
and Baconian philosophy. Yet, our age has a new
philosophy, and its herald, the computer, allows for quick, subtle
and efficient manipulations of ideas, facts and subjects. If,
therefore, my definition of classification is unsubstantive, I feel
it must remain so. We have a different age upon us, and what
have existed as inherent mental processes are changing, offering
us new and different experiences and practices. The human
process of classification and its substantiation in the library will
mirror these influences. It is only a question of whether we will
find the same kind of welding in library classification that will
give us a new and unique opportunity to classify our documents
and store them in the same mode, or some kind of dispirited
shuffling system that is efficient, but lacking in human
dynamism.

The Historical Context:

Traditional Classification Since 1950 

Gordon Stevenson 



Introduction

Twenty-five years ago, when librarians in the United States
spoke of classification, they were usually referring to two
specific library classifications: the Dewey Decimal Classification
(DDC) and the Library of Congress Classification (LC). The
habit of confusing the general idea of library classification with
the possibilities and limitations of DDC and LC had been
characteristic of United States librarians for generations. A
clear distinction was not made between general principles of
the nature, structure and uses of library classification and the
application of these principles in specific systems. This approach
to classification enshrined DDC and LC somewhere
near the center of librarianship. Even today, our problem is not
so much classification as what we think classification is and
how we think about it. The way we thought about classification
around 1950 was such as to give DDC and LC a legitimacy and
permanency of the sort usually reserved for religious texts and
sacred rituals. Unfortunately, this approach is still found to a
great extent today; and though DDC and LC seem to be even
more inextricably embedded in United States librarianship than
ever, it is now necessary to identify these two systems as "traditional
library classifications." They must also be identified as
"general classifications," because they know no subject limitations.

DDC and LC are traditional in an historical sense because their
roots are deep in the past, and in a practical sense because
they are used by librarians today in essentially the same way
they were used when they were introduced before and shortly
after 1900. They are also traditional because of their structures.

Gordon Stevenson is Associate Professor, School of Library and Information
Science, State University of New York, Albany.

They are internally structured with mutually exclusive, enumerated
classes that are arrived at by a logical process of division
that proceeds from broad concepts and disciplines to ever nar- 
rower and more specific subclasses. Since 1900, these systems 
have changed; but most changes have been quite superficial in 
terms of classificatory techniques. New classes have been introduced,
finer subdivisions have been made, and old classes
have been rearranged. But the traditional systems employ no 
basic structural or classificatory device that was not known be- 
fore 1900. 

In 1950, Jesse Shera critically evaluated the traditional classification
schemes and the principles on which they are based.¹ In
doing this, he succinctly defined their parameters and clarified 
the difference between traditional and nontraditional systems. 
In the meantime, we have learned a lot about classification and 
its theoretical and practical foundations. The past several dec- 
ades have seen a more intense examination of the foundations 
of classification than any other period since the last quarter of 
the nineteenth century. The results of these investigations, ex- 
periments, philosophical speculations and theories have 
created an ever-widening gap between the traditional systems 
and the newer, modern systems. It is the purpose of the present 
review to consider the two traditional systems in their historical 
context and to comment on the idea of general, as opposed to
special, library classification. With DDC and LC, we are dealing 
with two dinosaurs that one would have thought could not sur- 
vive into the second half of the twentieth century. They would 
appear to be relics of the past, and their survival— indeed, their 
continuing vitality—raises important questions about the nature 
and uses of classification by librarians in the United States. It is 
the contention of the author that it is impossible to understand 
the condition of classification in the United States today or to 
speculate intelligently about its future without an historical 
perspective. 



Classification Around 1950

General library classification as we knew it around 1950 was a
product of decisions made around 1900. Expectations about
the contributions of classification to subject control and access,
ideas about the structure of classification systems, and
general agreement about what constituted a proper subject
catalog had long since been formalized and incorporated into
the conventional wisdom of librarianship. A key historical event
of almost unprecedented importance in the history of subject
access was the rise of the dictionary catalog and the subsequent
disappearance of the classified catalog from United
States libraries. After that happened, the way we thought about
classification and its uses changed fundamentally. By 1950,
most librarians in the United States were not quite sure what a
classified catalog was for or how it was different from an alphabetical
subject heading catalog. Why this happened and its
long-range impact on both classification and subject access are
historical questions which have never been answered. Added to
this fundamental change in the use of classification was the
phenomenal dispersal of DDC and later LC. All competing systems
were swept aside and these two became such monumental
edifices that they have never been seriously challenged in
the United States. By the time Bliss published the final volume
of his Bibliographic Classification in 1953,² hardly anyone in the
United States took his work seriously. It is very possible that the
Bibliographic Classification was a better classification than
both DDC and LC, but it was published too late to have any
practical impact in the United States.

As late as 1950, many, if not most, library schools in the United
States taught all students the DDC system and saved LC for
those hardy students who went on to take "advanced cataloging."
It did not occur to anyone that there might be an alternative
to DDC and LC. Most of what we knew about general principles,
we learned from Berwick Sayers, the British classificationist
and teacher,³ but his work had some limitations. It
was not until the German translation of E. I. Samurin's monumental
history of classification was published in the late 1960s
that we had access to a coherent survey of the great European
systems in the full sweep of their historical evolution.⁴ But by
the time Samurin's work was accessible in the West, few librarians
in the United States were interested in the European systems.
For most of us, classification in Europe was then, as it is
now, a closed book. The reasons for this are also buried in our
past. In the early decades of the American Library Association,
there was a lively spirit of internationalism and an exchange of
ideas about cataloging and classification. This ended in 1914
for reasons which are obvious and have nothing to do with librarianship.
Since then, we have exported librarianship but have
assumed that there is little worth importing. The new internationalism
in descriptive cataloging that emerged from the
Paris Conference of 1961 has not led to similar trends in subject
cataloging or classification.⁵

The complete dominance of DDC and LC was due in a large
measure to the long-range trend to centralized cataloging and
national standardization. Improved subject control and access
were not the only issues involved in this trend. Another was the
rising costs of all technical services. When librarians thought it
necessary to make a choice between DDC and LC, the overriding
criteria that influenced their decisions were the economic
consequences of the two systems. After 1950, the role of the
library manager in making classification decisions increased.
The spirit that animated change was made clear by Raimund E.
Matthis when he said, "We must opt for the most workable tool
at present available to carry forward the mundane but needful
task of moving books and records from catalog department to
shelves and catalog."⁶

With decades of the neglect of classification behind us, it was
easy to accept without question the mystiques which began to
surround DDC and LC. Of the two systems, we learned more
about DDC than we did about LC. The massive size of the Library
of Congress, its central role in national bibliographic control
and its formidable staff of subject specialists gave it such
an awesome authority that few librarians even considered subjecting
the LC system to a serious, in-depth evaluation. Furthermore,
the belief that we knew and understood the historical
origins of the LC system was a myth. The extent to which LC is
based on nineteenth-century European systems has not yet
been documented. A reading of the works of Rudolph Focke
casts doubts on much of what we think we know about LC's
origins.⁷ In developing a classification code, Focke drew up a
series of rules which, when compared to LC, describe the
foundations of that system quite precisely. It must say something
about the LC system that Focke's code was written, not
for shelving systems, but for the sort of classified book catalog
common in German libraries around 1900. The historical implications
of this are fascinating and we await a thorough study of
LC's origins.

The DDC system, on the other hand, has been under almost
constant critical scrutiny since its first edition in 1876. We are
also reasonably well-informed as to its history. The fate of DDC
after 1950 has been one of the strangest chapters in the history
of modern librarianship. Today it is used by around 25,000 libraries
throughout the world, and at least a third of recent editions
have been sold outside the United States. A complete
French edition was published this year. The worldwide impact
of DDC continues to pick up momentum. But in the
United States, with the publication of the fifteenth edition in
1951, DDC was thought to be dead or dying. The fact that this
edition, despite obvious limitations, began to bring DDC into
the twentieth century was overlooked as librarians resisted
changes which would require extensive reclassification. A decade
and a half later, with the publication of the seventeenth
edition, reactions in the United States were even more disastrous.
We do not yet have a complete account of the extent of
the erosion of DDC in the United States, but it appears to have
been massive. In the mid 1970s, we are getting scattered reports
of high school libraries changing from DDC to LC.
Whether this change has been good or bad remains to be seen.
However, it is ironical that DDC has been improved, but the
changes necessary to make improvements have weakened its
hold on librarianship in the United States. The use of DDC in
the British National Bibliography (1950- ) has been entirely
beneficial and has helped to bring British classification experts
into the editorial apparatus that guides the future of the system.

Reevaluation of Traditional Systems 

While the library world at large went about working with the
traditional systems, the complexity of the postwar world began
to have an impact on the thinking of the more perceptive librarians
in the United States and abroad. In the early 1950s, the
most fundamental questions were raised about the very foundations
of the traditional systems and about the validity of any
general system of shelf classification. Criticism of library
classification was nothing new, but never before had fundamental
principles been so incisively examined. By 1950, Margaret Egan
spoke of the "ferment over classification." The impending impact
of the computer, the dispersal of the ideas of Ranganathan
and the tremendous increase in the production of scientific
literature had an impact on how librarians thought about
classification. There was a sense of urgency for the solution to
problems of bibliographical control. Within this context, DDC and
LC were examined and found to be grossly inadequate to deal
with information in the modern world.

With a firm commitment to the fundamental role of classification
in the organization of knowledge, Shera wrote a devastating
critique of traditional classification. Among the most enduring
ideas from this influential essay is the proposition that
traditional classifications are linear, and thus inadequate to
deal with the many facets and multidimensional approaches of
modern research. Inherent in much of the criticism that
emerged around this time was the assumption that general
traditional classification was inherently linear, but this was true
only of our uses of specific traditional classifications and of
weaknesses in their structure. In any case, in the 1950s the
world of knowledge seemed so immense, so unstable and so
complicated that it was widely assumed that no general system
would ever efficiently serve to provide subject access with any
precision. So librarians learned to live with DDC and LC, and
the locus of classification research was not to be found in
librarianship, but in the information sciences.



Nontraditional Systems 

The intense activity that has taken place in what may be
broadly categorized as "nontraditional classification" can only
be briefly noted here. The Universal Decimal Classification
(UDC), which never lacked enthusiastic advocates outside of
the United States, became the most widely used special system.
UDC continues to move further and further away from its base
in DDC (though both systems would clearly benefit if they were
brought closer together again). Ranganathan became the most
influential classification theorist of the twentieth century.
Faceted classification became a practical reality, and hundreds of
special faceted schemes were constructed. Strongly influenced
by Ranganathan, the British Classification Research Group was
founded in the early 1950s, and for the past twenty years has
been struggling with the problem of finding a means of developing
a new general classification system. Classification acquired
a whole new vocabulary, with such terms as links, roles,
planes, integrative levels, clumps, and isolates, to mention only
a few.


Throughout these years, theorists drew on widely scattered
sources, such as systems theory, linguistics and psychology. If
anything, we learned more about classification than we wanted
to know. The optimism emerging from the rise of information
science has turned to something bordering on despair.
Classification, we have learned, is not a physical thing consisting of
schedules and indexes; it is a process that takes place in the
human mind. But nonetheless never in the history of libraries
have we known more about classification.

There is no problem in assessing the impact of these develop- 
ments on traditional classification in the United States. The im- 
pact has been almost negligible. Somewhat cautiously, the DDC 
system has taken a few tentative steps towards the addition of 
concepts of faceted classification, though DDC is not and 
probably never will become a faceted classification. The LC system
has not changed at all, and as more libraries adopt this
system, the possibilities of change become more remote. The 
fact is that librarians who have adopted LC do not want it to 
change. The first law of classification dynamics is that the pos- 
sibility of change decreases exponentially as more libraries 
adopt a given system. This law operates whether the changes 
might be good or bad in terms of the purpose of the system. 

The high degree of standardization found in general library
book collections does not extend to nonbook materials. Here, 
we find a great variety of local systems. The standardization of 
classification of sound recordings, for example, is not even on 
the horizon. Whether this is good or bad can only be answered 
subjectively if costs are discounted. One could argue that in 
organizing these local collections, librarians have a rare oppor- 
tunity to use what they know about their collections, about the 
needs of their library users and about classification actually to 
construct good, working systems ideally suited to the functions 
and capabilities of their libraries. Such opportunities are not 
widely available to librarians who work with large book collec- 
tions. 



Summary and Conclusion

The development of applied library classification during the
past quarter century has been captive to what went before.
Many historical, economic, intellectual and emotional ties
bound us to the misty past of the late nineteenth century. If
some genius had devised a system better than DDC and LC, it
is doubtful if the course of history would have been different.
Even now, if we had a better system (and we could have a better
system if we wanted one), it would probably not be taken
seriously in the United States. If we have a problem, it is that
we cannot even conceptualize a better system.

The future of classification in the United States will be determined
by what it is that librarians want from a classification
system. At the present time, they do not seem to want very
much, and it may be that their modest expectations are well
served. Our two systems do seem satisfactorily to serve the
purpose of organizing materials on shelves. Or at least we are
convinced that they are satisfactory for this purpose. But the
conventional wisdom has not been subjected to any extensive
and rigorous scientific research. The use of a shelf classification
is a behavioral process. Something takes place in the mind
of the user as he contemplates quantities of books on shelves.
We know almost nothing about this process, and thus have no
real way of evaluating the efficiency of either DDC or LC. We
will probably continue to ignore this issue; but an issue we
cannot ignore is the interrelationships of the computer,
bibliographical access and classification.

More than anything else, our use of the computer will influence
the future of traditional classification systems. The computer
will either stabilize DDC and LC for many generations to come,
or it will lead to the eventual abandonment of LC, a considerable
reworking of DDC's notation, and possibly the development
of a new general classification with extensive national
ramifications. We have spent millions of dollars constructing networks
and systems of bibliographical access based on computerized
data bases. We have done this precipitously and with a
single-mindedness of purpose that has failed to take into account
the total implications of the enterprise. Not only has an
inefficient and illogical system of subject headings been
perpetuated on the MARC tapes, but each year thousands of titles
are entered into this system tagged with LC class numbers
which are almost completely useless in providing subject access
through a computerized classified catalog. At the same
time, in order to take advantage of the economic savings to be
derived from networks and centralization, hundreds of libraries
are switching from a system which shows some real potential
for new modes of classified bibliographical access to a system
with a nonhierarchical notation which is hopelessly antiquated
for computerized retrieval systems.

Looking to the future, DDC has the capability of developing into
a system that can exploit some of the potentials of the computer
and at the same time provide a system of class numbers for
shelving materials. The LC system, on the other hand, can
probably continue to expand internally and provide a system of
shelf numbers for the next fifty or more years. If this is what
librarians want, and if it should come to pass, classification in
the United States will, for all practical purposes, remain on the
fringes of bibliographical access. Finally, in considering
traditional classifications in both their broad historical context and
in the complex world of today's libraries, one gets the
uncomfortable feeling that we use these systems, not because they are
the best or most efficient systems or even because we understand
or like them very much, but simply because we are stuck
with them.


The "order of the sciences, the order of things" is the backbone
of traditional classification schemes which have been
used as instruments for bibliographic organization. On this
premise emerged several concepts which influenced the
development of classification systems before the twentieth
century. These concepts are:






1 the hierarchical order

2 the concept of classification for universal use

3 the enumerative system






The weaknesses and inadequacies of these schemes are attributed
to the fact that their structures are derived from
nineteenth-century principles of class logic rooted in the works
of Plato and Aristotle. The history of the "grammar of
classification" belongs to philosophy rather than librarianship. Despite
the baffling contradictions which ensued regarding the "order
of the sciences," it is advisable to look at some philosophical
systems that have influenced bibliographic schemes, either
directly or indirectly, to examine their deficiencies and determine
why these nineteenth-century schemes are no longer adequate
tools in the organization of materials and information in
present-day libraries. This will also provide students and practitioners
with a conceptual framework that would assist them in
understanding and synthesizing perspectives for classification
schemes used in libraries.

Historical Prelude

Historical insight in the classification of the sciences* is provided
by philosophers such as Plato, Aristotle, Bacon, Comte
and Spencer. The table below is a schematic comparison of the
different classical theories of the classification of the sciences
according to the philosophers cited.



Table 1

Classical Theories of Classification of the Sciences

Plato                Structure of the World      Collection and Division:
(4th century B.C.)   of Forms                    Classify forms according
                                                 to organized groups, as
                                                 indivisible species, and
                                                 in turn under genera

Aristotle            Imitates nature             Doctrine of Predicables:
(4th century B.C.)                               Natural grouping of
                                                 things according to
                                                 structures and processes

Bacon                Springs from one root       Tree System:
(17th century)       and originates from the     Branches of a tree that
                     dominant faculties:         meet in one stem
                     Memory, Imagination,
                     Reason

Comte                Staircase Hierarchy:        Law of Filiation:
(19th century)       Morals                      Decreasing generality to
                     Sociology                   increasing complexity:
                     Biology                     complex dependent upon
                     Physics                     those that are simple
                     Astronomy
                     Mathematics

Spencer              Abstract Sciences:          Classification Hierarchy:
(19th century)       modes under which we        Parallels Bacon's "tree
                     perceive                    system" and rejects
                                                 Comte's "staircase
                     Concrete Sciences:          hierarchy of knowledge"
                     groups of sense
                     impressions

*The term sciences is used in its unrestricted sense. It claims the whole
range of phenomena, mental as well as physical; the entire universe is
its field.2

Plato's Collection and Division

Plato's work on the theory of knowledge set forth the concept
of knowledge as a priori, and his deductive system of propositions
dominated seventeenth-century thought and flourished in
the nineteenth century. Plato advocated clear thinking, in terms
of sharply defined abstract concepts. His theory of classification
is reflected in his analysis of the structure of the world of
forms.

Aristotle's Predicables

The doctrine of predicables is the classification of conceptual
relationships between a subject and its predicates. It is also
referred to as Aristotle's doctrine of the categories: substance,
quantity, quality, relations, place, time, position, state, action
and affection. One recognizes from Aristotle's works some kind
of overall classification of animals. Of course, there is the
famous Tree of Porphyry which is a representation of the hierarchy
of nature as Aristotle saw it.

Bacon's Intellectual Globe 

Bacon, in Of the Dignity and Advancement of Learning, outlined
a revolutionized classification of the sciences. In this work he
reviewed the uncharted fields of knowledge and proposed a
new classification of the sciences which was to supersede that of
Aristotle. Bacon's plans for the advancement of learning included
not only a reclassification of the sciences but also a
reorganization of the divisions of human learning. Human learning
emanates from the three dominant faculties of the
understanding: memory, imagination and reason. This formed
the basis of his analysis of knowledge.

"The divisions of knowledge," Bacon writes, "are not like several
lines that meet in one angle, but are rather like the
branches of a tree that meet in one stem." Bacon's classification,
particularly his analysis of history and sociology, has influenced
the scheme of Spencer. The idea, common to Bacon and
Spencer, is that the sciences spring from one root and branch
out, while Comte sees it as a "staircase hierarchy."*

*Comte asserts that for us to reach the supreme morals as soon as
possible it is necessary that the study of each science is limited by the
requirements of the one next above it.

Comte's Law of Filiation

In Comte's system of "positive philosophy," the Law of Filiation
is associated with the Law of Classification. It determines the
order of development by decreasing generality or by increasing
the complexity of the phenomena, the more complex
phenomena being dependent upon those that are simple. His
"staircase theory of the hierarchy of knowledge," outlined in
an elaborate scheme, is historically interesting but wanting
from the standpoint of modern classification. However, Comte is
a link between Bacon and Spencer, for his writings on the Law
of Classification of the sciences acted as a catalyst to
Spencer's thoughts on the classification of the sciences.



Spencer's Classification

Spencer's classification of the sciences parallels Bacon's concept
of the sciences which is analogous to the "branches of a
tree spreading out from a common root." He rejects the staircase
arrangement of Comte's hierarchy. His classification combines
the "tree" system of Bacon with Comte's exclusion of
theology and metaphysics from the field of knowledge. It provides
builders of classification schemes an excellent starting
point.

In the preceding discussion of five philosophical systems, it is
quite evident that the theory of classification is closely linked
with the concept of the "universal order of things and ideas."
The question thus arises: Is there such an order? If so, what is
the nature of this order? In the analysis of the processes involved
in the classification and arrangement of things and
ideas, one finds that the two processes complement each other,
i.e., the former refers to the problem of sorting or grouping,
whereas the latter addresses itself to the problems of unity, or
the assembling of parts to form a whole.

The polemics that have gone on concern the problem of the
order of ideas and things in structures, such as order of
complexity, order by class logic or order of power. Upon examination
of the development of certain phenomena, one is bound
to find that ideas reflect the evolutionary stages they go
through in time from the simplest to the most complex.

Classification of the Sciences as a
Model of Bibliographic Organization

An examination of library classification schemes of the prefa- 
ceted era would reveal a close analogy to the classification of 
the sciences which was advocated from the time of Plato to 
that of Spencer. The method used in constructing the schemes 
is deductive. Traditional classification begins with the assumption
that classification is a process of division applied to a universe
of knowledge. This universe is fragmented in stages by
the application of various processes of division, namely: 

1 Logical division 

2 Physical division 

3 Metaphysical division 

One of the most fundamental divisions is the genus-species
relationship. This is achieved by the classical method of logical
division found in philosophical charts of learning, wherein all
main classes spring from the traditional disciplines of knowledge.
In physical division the parts of which an individual thing
or aggregate is composed are distinguished, as in man: head,
limbs, trunk, etc.; in a flower: sepal, petal, stamen, pistil, etc. In
metaphysical division we distinguish in a species, its genus and
differentia; in a substance, its different attributes; in a quality,
its different variables or dimensions; thus, in man: animality
and rationality; in sugar: color, texture, flavor, etc. Obviously
metaphysical division can be carried out in thought alone. In
logical division, when the genus is concrete, its individual
species can be exhibited in a museum case; likewise, in physical
division, the parts of an individual animal or plant may be
separated physically; but in metaphysical division the parts
cannot be displayed separately, e.g., taste or texture in salt can
never be exhibited by itself alone.

The deductive approach to such classification structure is
based on the general assumption that the sum total of knowledge
is arbitrarily divided into a number of main classes which
are, in turn, subdivided into subclasses and so on down to a
point where an infima species (irreducible unit) is arrived at.

The characteristic which dominates traditional classification
schemes is the logical order of entities. This is accomplished by
grouping entities according to the degree of likeness or similarity,
then arranging them from complex to simple. Such a structure
depicts a hierarchy of entities, the scheme order being
that the genus and species follow a downward process until a
unit in the hierarchy is irreducible (cf. Plato's method of
Collection and Division) as opposed to the upward process (inductive
approach) employed in modern classification schemes.

For our purposes we will use four general systems of book 
classification to illustrate the value and application of the clas- 
sical concepts of classification as an instrument of bibliograph- 
ic organization. These schemes are: Cutter Expansive (CE), 
Dewey Decimal Classification (DDC), Brown's Subject Classification
(SC) and the Library of Congress Classification (LC), all
of which manifest a close parallelism to philosophical classifi- 
cation systems characterized by a hierarchical structure, follow- 
ing the basic rules of the various processes of division. 

Two of the schemes (DDC and LC), despite their uneven de- 
velopment and imperfections in their logical arrangement, nota- 
tion and linear representation, are very much in use today. All 
four schemes are universal in range and scope. The schemes 
are hierarchical in nature, and in theory they follow the basic 
pattern of the inverted tree structure exhibited by taxonomic 
classification systems and are not based on "literary warrant." 

In examining these schemes, one encounters combinations and
variations of different principles proposed by individual
philosophers who have formulated the concept of the classification
of ideas. Except for LC (1901*), a product of team effort,
the bibliographical systems produced during the nineteenth
century were devised by individuals: DDC (1876), CE (1891) and
SC (1906**).

*LC's Class Z, Bibliography and Library Science, was completed in
1898.

**Work on Brown's Subject Classification began in the last decade of
the nineteenth century.

[Table 2, which sets out the main classes of the four book
classification schemes, is illegible in the source.]

The classification structure is manifested by the formation of
classes and proceeds to sort out the subclasses or individual
members of the class by enumerating the attributes or properties
which differentiate one entity from the other. Thus the class
Berries produces the species Cane fruits, Ribes, Huckleberries,
etc., and under Cane fruits, the species Raspberries, Blackberries,
Loganberries, etc. Hierarchies are thus created, based on
successive application of characteristics. The Law of Likeness,
which is the fundamental principle of the order of things, is
employed. Ideas arranged according to likeness determine the
order. In the main class concept, a generalia class is provided
to accommodate materials treating a variety of subjects or
subjects which are too general in nature to go to any other class.
This class generally precedes the inclusive classes for the
whole system, or it may be located within the subdivision of
each class.
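
The way a hierarchy arises from the successive application of
characteristics can be sketched in a few lines of Python. This is
only an illustration: the class names follow the Berries example in
the text, while the nested dictionary and the function name
`paths_to_leaves` are assumptions made for the sketch and belong to
none of the four schemes.

```python
# Illustrative sketch only: a tiny hierarchy built by successive division.
# The class names follow the Berries example in the text; the data
# structure and the function name are assumptions made for this example.

hierarchy = {
    "Berries": {
        "Cane fruits": {
            "Raspberries": {},
            "Blackberries": {},
            "Loganberries": {},
        },
        "Ribes": {},
        "Huckleberries": {},
    }
}


def paths_to_leaves(tree, prefix=()):
    """Return every chain from the top class down to an undivided class."""
    paths = []
    for name, children in tree.items():
        chain = prefix + (name,)
        if children:
            # The class is divided further: follow the division downward.
            paths.extend(paths_to_leaves(children, chain))
        else:
            # No further division is recorded: treat it as the lowest unit.
            paths.append(chain)
    return paths


for chain in paths_to_leaves(hierarchy):
    print(" > ".join(chain))
```

Each printed line is one downward path of division. Every entity
has exactly one place in such a structure, which is why a subject
that belongs under two characteristics at once is awkward to
accommodate.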

Division and subdivisions in these systems are arbitrary separations
of closely related main classes. For example, one finds in
CE, DDC and LC that the sciences are separated (e.g., Physical
Sciences form a class apart from Technology, and Fine Arts
from Useful Arts). At the other extreme, SC collocates Music
with the Physical Sciences under Acoustics, thus stretching
theory beyond practical considerations. Table 2 illustrates this
characteristic.

For literary works, the arrangement is by national origin, genre 
and chronological sequence, except in SC which abandoned 
this concept in favor of four form divisions and alphabetically 
listing all individual authors, regardless of national origin or the 
period in which the work was written, under each literary form. 

In the philosophical system of classification, only single concepts
are included, e.g., Forest exploitation, Forest utilization,
whereas bibliographical classification deals with compound
concepts, e.g., The effects of government regulations on forest
exploitation and utilization. Traditional classification attempted
to cope with this problem by listing every possible concept that
occurs, simple and compound, thus creating an enumerative
scheme. Needham states that the enumerative approach
failed for the following reasons:

1 Enumeration can never be complete. 

2 The theoretical application of class logic can be carried too far 
beyond practical reality, causing confusion because certain 
entities do not fall under any generic hierarchy. 

3 Cross-classification occurs because there is an overlapping of
attributes in an array of classes and compound subjects are
presented as if they were simple subsets of the preceding
subject.
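
The first of these points can be made concrete with a little
arithmetic. The sketch below assumes, purely for illustration, a
small subject field described by four independent categories; the
category names and terms are invented, loosely echoing the forestry
example above, and real schedules are not built this way. It counts
how many entries a schedule would need in order to list every simple
and compound subject, against the number of terms needed if compound
subjects were assembled from parts.

```python
from itertools import product

# Hypothetical categories and terms, invented for this illustration.
facets = {
    "operation": ["exploitation", "utilization", "conservation"],
    "influence": ["government regulation", "taxation", "climate"],
    "place": ["United States", "Canada", "Brazil", "India"],
    "period": ["19th century", "20th century"],
}


def enumerated_subjects(facets):
    """List every subject an exhaustive enumeration would have to print.

    Each category contributes either one of its terms or nothing (None);
    the single combination with nothing at all is not a subject.
    """
    choices = [[None] + terms for terms in facets.values()]
    subjects = []
    for combo in product(*choices):
        terms = tuple(t for t in combo if t is not None)
        if terms:
            subjects.append(terms)
    return subjects


listed_terms = sum(len(terms) for terms in facets.values())
subjects = enumerated_subjects(facets)
print("terms to list if subjects are assembled from parts:", listed_terms)
print("entries to list if every subject is enumerated:", len(subjects))
```

With these twelve invented terms the exhaustive list already runs
to 239 entries, and each new term multiplies it further, which
suggests why enumeration can never be complete.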

Traditional classification employs some form of notation, either 
pure or mixed. Notation is merely a coding device which displays
the order of entities in a scheme and facilitates the
mechanical arrangement of materials on shelves. The problems 
associated with notational requirements will be dealt with later. 

Concomitant to the employment of notational devices in classification
schemes is the introduction of synthesis and
mnemonic features. Such features are exhibited in various 
ways, particularly: 

1 Number-building devices which take the form of common 
tables for standard form divisions and geographic or area 
tables as illustrated in CE and DDC, and the categorical tables
of recurring elements in SC. LC assigns each subject a set of 
standard subdivision and area tables. 

2 Mnemonic features are introduced by means of auxiliary tables 
listing constantly recurring categories. Each of these categories 
is consistently denoted by the same notational symbol, thus 
enhancing the memory value of the notation. 
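
The number-building idea can be sketched as follows. The notation
is entirely hypothetical: the base symbols, the form and area
symbols, the punctuation that joins them, and the helper
`build_number` are invented for illustration and do not reproduce
the tables of CE, DDC, SC or LC.

```python
# Hypothetical notation, invented for illustration only.
BASE_CLASSES = {"Forestry": "F4", "Music": "M2"}
FORM_TABLE = {"Dictionaries": "d", "Periodicals": "p"}  # recurring forms
AREA_TABLE = {"Canada": "ca", "United States": "us"}    # recurring places


def build_number(subject, form=None, area=None):
    """Synthesize a class mark from a base class plus auxiliary tables."""
    parts = [BASE_CLASSES[subject]]
    if form is not None:
        # In this sketch a point introduces the form symbol.
        parts.append("." + FORM_TABLE[form])
    if area is not None:
        # In this sketch a hyphen introduces the area symbol.
        parts.append("-" + AREA_TABLE[area])
    return "".join(parts)


print(build_number("Forestry", form="Periodicals"))
print(build_number("Music", form="Periodicals"))
print(build_number("Forestry", area="Canada"))
```

Because the same symbol stands for Periodicals under every base
class, a user who has met it once can recognize it anywhere in the
scheme, which is the mnemonic value described above.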

In reviewing the principles of dividing knowledge set forth by
classical philosophers, Plato's concept of the structure of the world
of forms and Comte's Law of Filiation have played an important
role in the hierarchical features evident in traditional
classification schemes. The ordering of concepts according to their
degree of likeness, arranged in an evolutionary form from the
simple to the more complex or vice versa, reflects the Baconian
influence but probably owes more to the Aristotelian doctrine of
predicables.



Problems 

The originators and proponents of the traditional system, in
adopting the classical concepts, worked out certain practical
adjustments at the operational level of the schemes. These were
made necessary because the implementation of the system
called for a functional organization. More importantly, collections
in libraries required the display of relationships not only
by classifying similar items but by integrating those relationships
that showed the effect of one class upon another. However,
the classification in the traditional system excludes the
latter. Here a problem in this approach surfaces. A philosophical
system encompassing universal knowledge is inadequate as a
model in devising a classification system which deals not only
with complex concepts but also with the vehicles that transmit
them.

The publication and dissemination of materials in a variety of 
subjects and physical formats are continuously increasing at an 
exponential rate. The inevitability of future discoveries and ex- 
plorations renders the universe of knowledge continually 
changing in quantity. For effectiveness, a classification scheme 
derived deductively depends upon the invariability of the as- 
sumed sum total of knowledge. In effect such a scheme would 
require continuous revision and updating in order to keep
abreast of the state of the sociology of knowledge. Thus a
permanent complete scheme covering the whole field of knowl- 
edge is still an impossibility. 

What needs to be understood is the fact that the deductive
approach lacks the flexibility to accommodate new subjects
whenever they occur without revision. An enumeration of all
subjects within a class or set of subclasses is nearly an
impossibility. Furthermore, such enumeration is compounded by the
problems associated with classification schemes that are
universal in range and scope of their applicability. To produce a
universal classification of knowledge, one must, theoretically,
have all knowledge available. In reality only representative
samplings of the different branches of knowledge are covered in
universal classification schemes. It is difficult enough to be
certain that a set of subclasses completely covers the parent class
and it is much more difficult to ascertain extant classes and
predict what other classes may be added in the future.

The ever-recurring problem of synonymous terms and their 
standardization in a universal classification contributes to the
problem in a scheme dependent upon the state of the theory of
knowledge and its ramifications. 

Schemes that are enumerative in nature have most of their
compound subjects precoordinated. The tables of CE, DDC, SC
and LC attempt to find a place for complex concepts that are
likely to change within short periods of time or would vary from
one document to another. The process of classification becomes,
in most cases, nothing more than an exercise in
approximation. For example, the tables of CE, DDC, SC and LC do
not include compound concepts such as The production of
goats' milk cheese or The incidence of asthma in winter of 1962
or Spring harvesting of wheat.

The progress of knowledge has contributed to the instability of 
main classes because such categories tend to be names of col- 
lections of ideas that are very much colored by the theory and 
state of knowledge. An entirely different view of life and knowl- 
edge is expressed in classical schemes which give precedence 
to philosophy and philosophical writings, or the Russian 
scheme which gives more importance to Marxism or other 
socialist works. 

Traditional classification schemes choose the major disciplines
as their main classes. (See Table 2.) Aside from the fact that it
is difficult to clearly draw the boundaries between main classes
and determine the required number of main classes satisfactorily,
there is the further disadvantage in using such disciplines
as the summum genus in a scheme. It is often possible to show
how the entities of one class vary until such entities begin to
approximate the entities of another class. Then the suspicion is
generated that there may be no fixed classes in nature and the
once obvious differences observed in entities are all products
of differing environments in which these entities are found and
through which they have passed. A class organization of
knowledge which includes concrete and empirical entities fails
to be wholly adequate because it is incapable of organizing the
varying characteristics that develop in entities in varying
environments.

Where main classes are used, the classification scheme must
provide some rules for establishing order in the scheme.
Collocation presents another problem, since there
are no restrictions on proximities where schemes are essentially
linear. Another problem relating to the lack of rules is the
application of the rules of logical division. Logical division does
not provide rules relating to compound subjects, nor does it
give rules for the arrangement of classes in an order. In a
theoretical scheme such rules may be deemed unnecessary, but
when a scheme is used for bibliographic organization, such a
rule is almost imperative. It is necessary to have a preferred
order for arranging physical objects on the shelves or the entries
in a catalog. There are further limitations to the use of logical
division for bibliographic organization. The theory of division
breaks down amidst the complexity and variety of concrete
entities. Ideally when a genus is subdivided into species,
whether once or through several stages, it is assumed that at
each stage a number of definite species are included in that
genus. For example, in the biological sciences such a division
is clear-cut and definite. But in other classes, for the most part,
it is not possible to expect entities to fall into the genus-species
relationships which would fit into the perfect structure of a
logical division. Neither would it be possible to completely exhaust
the parent class or to enumerate all the individual members of
the class that already exist or may be discovered. The most
serious limitation to logical division is that it only deals with
one kind of relationship, that of a thing and its kind, known in
scientific jargon as genus and species. In library classification
we are concerned with several other types of relationships;
therefore it is necessary to apply other types of division in a
classification scheme developed for use in libraries. The general
relationship between genus and species is particularly not
applicable to the organization of relationships involving spatial
position. In addition, it is less possible to represent in class
logic that part of the empirical sciences which deals with the
continuous or discontinuous alteration of behavior of specific
entities evolving from the changes in their environments.

Traditional schemes are further besieged by the problems of 
notational requirements. Schemes which were developed origi- 
nally for the classification of ideas are used to classify ideas 
contained in a physical object. In arranging books in libraries, 
we are forced to consider their physical characteristics, thus 
requiring modification of any pure knowledge classification.

The interpolation of a notation adds to the confusion. The
notation, which serves as a location symbol, becomes a code
representing the natural language. Notation introduces some
undesirable inflexibility to a scheme and is "still-born" since it is
incapable of growing. Notation is a necessity in a classification
scheme since it displays and preserves the desired order of
entities and acts as a shorthand code for the natural language.
Since its function is to preserve order, it is obvious that a
notation must make provision to include new subjects as they arise.
The quality needed for this aspect is referred to as hospitality.
To accommodate developing subjects, expansion is provided by
reserving large blocks of unassigned numbers. Since there is
no way of prejudging what subjects will develop either rapidly
or slowly, faulty apportionment of notation arises. Many
important classes are not developed sufficiently and there is
very little room for expansion, or in some instances, one ends
up with unwieldy, lengthy notation. On the other hand, subjects
of very limited significance have comparatively large blocks of
numbers assigned. This problem is closely tied to the fact that
traditional classification schemes were constructed on the a
priori basis of class division, with the exception of LC which
was based on literary warrant. In most schemes one will find
classes that are well developed and some classes that are not
represented by any single publication, or, if at all, by very few
indeed.

If knowledge were static, notation would not be a problem
since it would be easy to add the notation to the scheme after it
is completed. But since new subjects are created within classes
or divisions, it is imperative that such new additions be located
in their correct place within the scheme. If the notation is
inflexible, then it will dictate the order, thus preventing its
effective use. Notation does not improve the scheme per se but is a
necessary evil in a working classification.

Adaptability to machine techniques requires that a scheme
should have the facility to express generic relationships if
hierarchical searching is required. Relationships among
compound and composite subjects need to be made explicit
through the use of notational symbols. In machine searching, it
is necessary that concepts be associated consistently with one
unique code. This process would be difficult and expensive to
achieve in any traditional scheme based on main classes, since
the notation which represents any particular concept keeps
changing according to the class from which it is derived. On
the other hand, the use of a notation that expresses hierarchical
structure which is effective for machine storage and retrieval
could be exploited so that the genus-species relationships
could be displayed by lengthening or shortening the notation
representing the concept. Thus the user would be able to
broaden or narrow his search at the level of any particular
element in a compound subject.

An enumerative scheme fed into a computer will not allow
retrieval of a particular element. To be effective each class
number needs a code representing only one subject and used
consistently for that subject. Except for tables of standard
subdivisions and area tables, this is not the case with traditional
classification schemes created before the faceted approach was
recognized.
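
The idea of broadening or narrowing a search at the level of one
element in a compound subject can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not
a description of any existing system: the codes, the document
records and the function names are all hypothetical, and each
concept is assumed to carry one unique hierarchical code whose
leading characters identify its genus.

```python
# Hypothetical records: each document carries one hierarchical code per
# element of its compound subject. All codes are invented for illustration.
DOCUMENTS = {
    "doc1": {"6331", "2741"},  # e.g. wheat / spring harvesting
    "doc2": {"6332", "2741"},  # e.g. barley / spring harvesting
    "doc3": {"6331", "2752"},  # e.g. wheat / autumn sowing
}


def matches(doc_codes, query):
    """True if every query element is a prefix of some code on the document."""
    return all(any(code.startswith(q) for code in doc_codes) for q in query)


def search(query):
    """Return the sorted ids of all documents that satisfy the query."""
    return sorted(d for d, codes in DOCUMENTS.items() if matches(codes, query))


narrow = search(["6331", "2741"])   # both elements fully specified
broader = search(["633", "2741"])   # first element shortened by one character
```

Shortening the first code from 6331 to 633 widens only that
element of the compound subject; the second element stays as
specific as before.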



Summary and Conclusion

This paper attempted to show that a classification scheme 
should not be evaluated on the basis of its completeness or 
"neatness" alone, but also on the extent to which it advances 
knowledge and achieves the purpose for which it was originally 
created. 

Traditional classification schemes have proved inadequate as 
instruments of bibliographic organization in the face of the 
ever-expanding field of knowledge and of technological
developments, particularly computers. The schemes are
handicapped by limited recall capabilities due to the dispersal of
related aspects of entities inherent in enumerative schemes and
in one-dimensional linear classified arrangement. From them,
however, certain fundamental principles, theories and concepts 
of the organization of knowledge have emerged which are cru- 
cial to the development of modern classification schemes. In 
1970 Foskett wrote: 

In our technique for information control the time is ripe for 
the overthrow of existing paradigm, but we should not, at 
the same time, reject those aspects of it that can usefully 
contribute, for what we need now is not a blank slate, as was
once thought, but a genuine synthesis.

The problem of terms and their standardization in a
classification for universal use can only be resolved by an overall
acceptance of a single authority. Such standardization might be
difficult to achieve since there is no such thing as an "all-around"
view of the world. People's perception of reality is conditioned 
by the constraints of their cultural orientation. If we are seeking 
to accumulate a store of knowledge that may be employed in 
an eclectic fashion, we should strive to eliminate all vagueness. 
Furthermore, we usually believe that nature is not vague and it 
should follow that knowledge of nature should not be vague. In
practice this vagueness cannot be eliminated. However, it can
be reduced. The particular kind of organization that traditional 
classification schemes give to knowledge makes it especially
difficult to eliminate vagueness of connotation and denotation to
any desired degree. 

The rapid advancement of knowledge requires that schemes 
undergo frequent revisions and updating, even in areas where 
knowledge remains unchanged. Revisions will still be necessary 
but will take the form of extensions. In view of this, it is
imperative that librarians become "independent classifiers." This
means that librarians should have complete understanding of
the principles, theories and concepts of classification so that
they are in a position to amend, modify or revise any
classification scheme within the normal limits of human error.

Although classification is a matter of picking out and
conceptually grouping together certain entities of a heterogeneous
field, it should be remembered that in the process its grouping
of entities interrupts and disregards relationships between
entities that fall into different classes and overemphasizes
relationships between entities that fall into the same class. And this
is especially true with schemes developed deductively. In effect
we want a scheme that will reflect class organization and at the
same time reflect cross-class relationship. A functional
organization can preserve better than class organization specific
variations in entities, and it would be foolhardy to sacrifice this
advantage by allowing "likeness" to absorb or displace
dissimilarities in classification. For knowledge may reflect a
knowledge of a class of entities used as a justification for a
particular classification and as an explanation for the fact that
members of the same class behave differently. But it is not a
knowledge organized exclusively by class relationships. Even if
knowledge is about members of a single class, it contains
references to entities belonging to other classes; thus it should be
an organization in terms of relationships that cuts across
classes. Furthermore, such relationships should have the
capability of being machine-manipulated and retrieved in a
variety of ways at every accessible point.

A classification scheme must not arbitrarily group the materials
of experience into few classes. There may be major classes, but
there must also be numerous subclasses equipped with
cross-classification mechanisms.

Classification schemes that are closely associated with
philosophical systems have a strong tendency to be regarded
as either "natural" or "artificial," which is perhaps distortive of
reality. For man to be fully satisfied with a classification system
he needs to become aware of his own classifying activity and
consciously to strive to control and master it. And this control
and mastery are best exercised in a purposeful manipulation of
classificatory concepts, with full awareness of the various ways
in which complex entities could be classified and of the needs
which any desired classification schemes must satisfy.



It must be clearly stated at the outset that this is not a review
paper. Instead, I have taken this opportunity to present personal
opinions regarding the role of certain types of classifications in
our modern automated environment. Although there may be
some statements the reader may not agree with, they will
hopefully be offset by other concepts which do have merit and can be
used by the reader to improve operational information systems or
be incorporated into plans for new systems.




Classification As Used in this Paper 




Since the word classification can be used in so many different 
ways, it is essential to indicate that in this paper
classification refers to highly structured, hierarchical classifications
found on the far right in Figure 1. This figure shows a spectrum
of sources of index terms, concepts and/or notations used in
various types of indexing and retrieval operations. 




uncontrolled vocabularies -> alphabetical subject authority lists ->
three-level thesauri -> "word trees" -> multilevel hierarchical
classifications



Figure 1




In general, classifications are used to organize concepts (often 
expressed as single words or short phrases) in a logical,
systematic fashion and to show relationships between concepts.
They are created by grouping concepts that share similar
characteristics, particularly those characteristics that are most
significant for meeting the anticipated retrieval needs of a 
known user group. 

The most useful classifications from a user standpoint are 
those that reduce the effort needed to retrieve precisely the 
needed items of information at exactly the level of detail re- 
quired. Ideally, the classification should permit the user to 
select one or two categories containing all the required infor- 
mation instead of having to identify many separate concepts 
and link them together by a complex search strategy, often re- 
quiring several revisions, before the desired information is re- 
trieved. The performance of a classification, then, is mainly a 
function of how well the developers of the classification have 
foreseen the needs of the users and grouped concepts together
for users at the multiple levels of generality and detail likely to 
be needed by most users. 

The preceding statement suggests that the performance of an
information retrieval system can be improved by moving from
left to right in Figure 1, which is arranged in order of increasing
degree of organization and increasing delineation of the
relationships between concepts. In actual fact, information systems
that started with sources of concepts to the left of center in
Figure 1 have generally been forced to move to the center or
right-hand side of Figure 1 in order to improve performance.
This left-to-right shift has been well described by Lancaster,
who also points out that the thesauri are actually a rather
extensive but covert or hidden classification.

When systems progress through the sequence shown in Figure 
1, they seldom take the ultimate step which involves the de- 
velopment and use of a multilevel hierarchical classification 
and a hierarchical notation. It is because this type of
classification has such tremendous potential for improving the
performance of information retrieval systems, and because it is so
seldom used, that I have chosen to emphasize it in this paper.



What is a Modern Classification?

To be truly "modern," a classification must be (a) free of
constraints associated with many existing traditional
classifications so that it can be easily and frequently revised to keep it
up-to-date, and (b) structured and used in a way that takes full
advantage of the capabilities of computers and computer
systems. 

Full freedom from constraints is most nearly achieved when a
"modern" classification is developed de novo. In other words,
there is little chance of success in undertaking the thankless,
very difficult task of trying to modernize a classification that is
hopelessly out-of-date. The existence of high-level committees
who must approve each change, no matter how trivial or
obvious, all but guarantees failure of any attempt to keep a
classification in a fluid evolutionary state with frequent modification in
response to changes in information being indexed.

If it is really necessary to have classifications which are used
with only minor variation to arrange documents on shelves in
hundreds, if not thousands, of libraries throughout the world
(and space does not permit me to do more than suggest that
this may no longer be necessary), then it is clearly desirable to
keep changes in the classification to a minimum, and
modernization may actually be undesirable. This paper does not deal
further with this difficult dilemma.

Instead, I am dealing with those situations where it is possible 
to take a fresh or a new look at a limited subject area and 
create new or extensively revised and open-ended classifica- 
tions which conform as closely as possible to current thinking, 
current terminology and the present conceptual framework of 
those who work in the subject area. These modern classifica- 
tions which cover selected subjects in considerable depth 
(rather than the near-universal and sometimes superficial 
coverage of some traditional classifications) are normally used 
in one rather centralized system by only a small group of in- 
dexers. The retrieved information is usually only a citation or ci- 
tation plus abstract or some other document surrogate which 
does not need a unique classification number for determining 
its physical location on a shelf or in a file drawer. In these
cases, the classification can be virtually constraint-free, which
is the first criterion of a truly modern classification that will
remain viable and useful for an extended period of time.

The movement away from the need for a one-to-one
correspondence between a document and a notation derived from a
classification also leads directly to the second criterion for a modern
classification. In modern systems, classifications should be
structured and used as sources of multiple categories that are
independently assigned to each document and are manipulated
through simple Boolean logic in computerized systems to
retrieve only those documents assigned to any desired
combination of categories. This use of multiple categories from
a classification is exactly the same process as selecting
multiple descriptors or keywords from a list of subject headings or a
thesaurus. The difference, as described in more detail later, is
that the categories from a classification can frequently be much
more powerful descriptors than isolated keywords or phrases.

This multiple assignment of categories from a modern
classification to each document is clearly different from the use of
more traditional classifications to assign a single unique 
number to a document. 
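
A minimal sketch may make the contrast concrete. It assumes an
inverted index that maps each category to the set of documents
assigned to it; the document identifiers, the category notations
and the function name `build_index` are hypothetical and chosen
only for illustration. Boolean AND, OR and NOT then reduce to set
intersection, union and difference.

```python
from collections import defaultdict

# Hypothetical assignments: several categories per document.
ASSIGNMENTS = {
    "d1": ["51.83", "12.4"],
    "d2": ["51.83", "30.1"],
    "d3": ["12.4", "30.1"],
    "d4": ["51.83", "12.4", "30.1"],
}


def build_index(assignments):
    """Invert document -> categories into category -> set of documents."""
    index = defaultdict(set)
    for doc_id, categories in assignments.items():
        for category in categories:
            index[category].add(doc_id)
    return index


INDEX = build_index(ASSIGNMENTS)

both = INDEX["51.83"] & INDEX["12.4"]      # 51.83 AND 12.4
either = INDEX["51.83"] | INDEX["12.4"]    # 51.83 OR 12.4
without = both - INDEX["30.1"]             # (51.83 AND 12.4) NOT 30.1
```

No category is privileged as the single shelf location of a
document; any combination can be requested at retrieval time.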



The Value of Modern Classifications 

The problem of GIGO (garbage in, garbage out) in information
systems can best be solved by thorough analysis and
organization of the information as it is entered into the system.
Assignment of categories from a modern classification to each
entered data item specifically identifies that item by placing it with
a group of other items having nearly identical characteristics.
Because of its location within the hierarchy, this process also
relates the new data to other data items already in the system.
As a result, the retrieval of very "clean" data is greatly
facilitated.

Use of a modern classification to analyze and organize data at 
the time of input is particularly valuable in the "soft" areas of 
science, such as social science, political science, the human- 
ities, and other subject areas where concepts are not closely 
linked to specific technical terms and are often described in
wordy and imprecise phrases or in jargon that has meaning 
only to a small in-group. 

In the physical sciences indexing problems arise more from the 
very large and increasing number of highly specific technical 
terms which must all be included in a search for information in 
a given area. Again, assignment of categories from a modern
classification automatically places new input into small groups
of closely related information items that can often be retrieved 
as a unit by specifying a single category number. 

The problems mentioned above are compounded by the very 
large size of many data collections, the increasing specializa- 
tion of both data and users and the rapid emergence of new 
subject areas resulting from the interconnection of two or more 
lines of research that were previously distinct entities. 

Again, a constraint-free modern classification can effectively 
handle these problems, if it is kept up-to-date by frequent
subdivision of categories to deal with new subspecialties and by
the frequent addition of new categories to deal with new inter- 
disciplinary topics. No matter how large the collection, the 
specificity of category descriptions, the detailed relationships 
built into the classifications, and the ability to specify the exact 
generic level needed for each search make it possible to re- 
trieve data items in a very narrow subject specialty with high 
precision. When the categories are used in Boolean expres- 
sions, they become even more powerful precision devices. 

At the same time, modern classifications give the user tight 
control of synonyms and near-synonyms as well as related con- 
cepts that can only be expressed by a string of words. This 
synonym control, along with the automatic grouping of closely 
related items by the assignment of categories from a modern
classification, enables the user to achieve high recall of all
relevant items, no matter how diverse the terms used to describe
those data items. 

In any information system there is an inverse relation between 
recall and precision. However, the preceding paragraphs state 
my conviction that modern classifications can be used to
increase both recall and precision above the levels possible with
other types of indexing tools. Experiments at the Smithsonian
Science Information Exchange have provided some experimental
verification of this point.
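
For readers who want the two measures spelled out, the sketch
below uses the standard definitions: recall is the share of all
relevant items that were retrieved, and precision is the share of
retrieved items that are relevant. The item labels and the two
sample searches are invented purely to illustrate the usual
trade-off; they are not data from the experiments mentioned above.

```python
def recall_and_precision(retrieved, relevant):
    """Return (recall, precision) for one search, given two sets of item ids."""
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical collection: four items are relevant to the request.
RELEVANT = {"a", "b", "c", "d"}

# A narrow search finds only relevant items, but misses half of them.
narrow = recall_and_precision({"a", "b"}, RELEVANT)

# A broad search finds every relevant item, along with four irrelevant ones.
broad = recall_and_precision({"a", "b", "c", "d", "w", "x", "y", "z"}, RELEVANT)
```

In this invented case the narrow search trades recall for
precision and the broad search does the reverse, which is the
inverse relation referred to above.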

Structure and Organization

Because computers can retrieve any desired combination of 
categories at the time of retrieval, it is no longer useful in
modern automated systems to precoordinate categories from a
classification at the time of indexing. Thus, traditional synthetic 
classifications, such as faceted classifications and colon clas- 
sifications (to the extent that these terms suggest the need for 
synthesis or precoordination of different facets or elemental 
categories to form a unique notation at the time of indexing), 
are not truly modern classifications and are not well suited for 
use in modern automated systems. 

Instead, to be fully effective, a modern classification should be 
enumerative rather than synthetic. That is, it should have a 
deep, multilevel, open-ended hierarchical structure which lists 
or enumerates all the unique concepts needed for indexing 
data at all levels of detail likely to be needed by the user.
Whatever precoordination of words and phrases is desirable for
identifying basic concepts should be built into each category as 
the classification is developed. 

If this is done well, most of the concepts needed for indexing
or retrieval will have a one-to-one correspondence with
categories in the classification, and the need for post
coordination will be greatly reduced. The resulting classification,
structured along the lines just suggested, contains within the
definition of each category a high level of "judicious
precoordination" which Lancaster points out is useful for reducing the
problem of noise in information systems. Ways to build
precoordination into categories will be briefly outlined in a later
section.

I have suggested elsewhere that the acronym HICLASS be
used to describe enumerative, "deep" or multilevel HIerarchical
CLASSifications with extensive precoordination built into the
categories. The same paper describes how this type of
classification was successfully used with no post coordination in a
system for selective dissemination of information (SDI) based
on single-hit matching between any one of several categories
assigned to a user and any one of several categories assigned 
to a document. 
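
Single-hit matching of this kind can be sketched as a test for
any shared category between a user profile and a document. The
sketch assumes that a match means exact equality of category
notations, which the description above does not specify; the
profiles, notations and function names are hypothetical.

```python
def single_hit_match(user_categories, document_categories):
    """True if the user and the document share at least one category."""
    return not set(user_categories).isdisjoint(document_categories)


def recipients(profiles, document_categories):
    """Return the sorted users whose profile matches the document."""
    return sorted(
        user
        for user, categories in profiles.items()
        if single_hit_match(categories, document_categories)
    )


# Hypothetical user profiles, each holding several categories.
PROFILES = {
    "user_a": {"51.83", "12.4"},
    "user_b": {"30.1"},
    "user_c": {"77.2"},
}

notified = recipients(PROFILES, {"12.4", "30.1"})
```

Because one shared category is enough, no Boolean combination of
categories is needed at the time of matching.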

Although the SDI system just referenced demonstrated that
post coordination of categories from a HICLASS type of
classification is not essential, a special type of post coordination
would have improved the matching of users and documents.
About one-third of the documents which users rated as being
of no significant interest would not have been matched with
those users if answers to a few simple questions of the
following type had been post coordinated with more substantive
subject categories:

Does the information involve a human patient? 

Does the experiment involve human tissue? 

Was the experiment performed exclusively in laboratory equip- 
ment and not in a living animal (i.e., was it an in vitro or an in 
vivo experiment)? 

Did the information involve newborn or very young animals?

In other subject areas, similar questions might cover geo- 
graphic locations, ranges of years or other periods of time, 
anatomical sites, etc., if these aspects of the information were 
not already used as major categories. The questions themselves 
can take the form of a simple checklist which supplies special 
tags that can be checked off for both users and documents at 
the time of indexing and used as a form of post coordination at 
the time of retrieval. 

In modern computer systems, answers to many of the types of 
questions just listed are best handled as a short string of bits in 
the computer record. Each bit in this bit string can be turned
off or on, depending on the answer to a corresponding
question or item in the checklist. Screening of these bits to make
sure they match the user request is a simplified modern form of
post coordination. A modern classification should be structured
and organized to take full advantage of this ability to use the bit
screening capability of computers. 
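
The bit screening just described can be sketched with ordinary
integer bit masks. The tag names mirror the sample questions
above, but the bit positions, the sample records and the function
name `passes_screen` are assumptions made for illustration only.

```python
# One bit per checklist question (positions are arbitrary assumptions).
HUMAN_PATIENT = 1 << 0
HUMAN_TISSUE = 1 << 1
IN_VITRO = 1 << 2
YOUNG_ANIMAL = 1 << 3


def passes_screen(doc_bits, required=0, excluded=0):
    """True if all required bits are on and no excluded bit is on."""
    has_required = (doc_bits & required) == required
    has_excluded = (doc_bits & excluded) != 0
    return has_required and not has_excluded


# Hypothetical document records, each with its checklist bit string.
DOCS = {
    "d1": HUMAN_PATIENT,
    "d2": HUMAN_TISSUE | IN_VITRO,
    "d3": YOUNG_ANIMAL,
}

# A user who wants work on human patients and no in vitro experiments.
clinical = [d for d, bits in DOCS.items()
            if passes_screen(bits, required=HUMAN_PATIENT, excluded=IN_VITRO)]

# A user who wants anything except studies of very young animals.
no_young = [d for d, bits in DOCS.items()
            if passes_screen(bits, excluded=YOUNG_ANIMAL)]
```

The screen would be applied after the subject categories have
produced a candidate match, so it only removes unwanted items.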



Precoordination and Post Coordination 

The best modern classifications probably fall somewhere be- 
tween the faceted or colon classifications (which require exten- 
sive coordination of categories to define concepts needed for 
retrieval) and the enumerative, deeply-detailed, multilevel 
hierarchical classifications of the HICLASS type which have
extensive precoordination built into the categories, and
consequently need only minimum coordination of categories. It
should be stressed again that coordination of categories in a
modern system implies use of the computer only for post
coordination at the time of retrieval, and not precoordination at the
time of indexing. 

A modern classification can be used to achieve the optimum
balance between the amount of precoordination built into
categories and the amount of post coordination required by
system users. I have suggested elsewhere that it is much better
from a total information system viewpoint to tip the balance far
toward the side of precoordination when the classification is
developed. This is accomplished by having subject experts and
potential users devote considerable time, effort and thought to
building an enumerative classification with categories that fully
describe each concept (with precoordination of all its
components) and a deep multilevel hierarchical structure that clearly
and accurately relates each concept to other categories in the
classification. This operation (apart from revisions and
updating, which all systems require) is performed only once by a very
few experts.

In contrast, a number of existing systems now have hundreds 
of on-line users, often with very limited knowledge of the sub- 
ject area, who make thousands (and for some systems, several 
hundred thousands) of searches each year. The development 
and use of a good modern classification is easily worth the ef- 
fort, if it results in a saving of even a few minutes per search, 
even a slight increase in the recall or retrieval of useful infor- 
mation and a slight decrease in the "noise" or an increase in 
the precision of the retrieved information. These small savings 
by system users, multiplied by the number of users, the number 
of searches and the number of years of use, should more than 
offset the one-time cost and effort of building and using a good 
modern classification for structuring, analyzing and organizing 
the massive amounts of information in contemporary informa- 
tion systems. 
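
The arithmetic behind this argument is simple enough to sketch.
Every figure below is invented for illustration (the building
effort, the minutes saved and the number of searches are not
taken from any real system), and the function names are
hypothetical.

```python
def hours_saved_per_year(minutes_saved_per_search, searches_per_year):
    """Total searcher time saved in one year, in hours."""
    return minutes_saved_per_search * searches_per_year / 60


def break_even_years(build_hours, minutes_saved_per_search, searches_per_year):
    """Years of use needed before the savings equal the one-time effort."""
    saved = hours_saved_per_year(minutes_saved_per_search, searches_per_year)
    return build_hours / saved


# Assumed figures: 4,000 expert hours to build the classification,
# 3 minutes saved per search, 100,000 searches per year.
saved = hours_saved_per_year(3, 100_000)
years = break_even_years(4_000, 3, 100_000)
```

Under these assumed figures the one-time effort is recovered in
less than a year; with fewer searches or smaller savings the
break-even point moves out accordingly.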

Notation and the Index for Modern Systems 

Two requirements for a modern classification system which 
have not yet been mentioned are (a) a notation system that is
maximally useful in computerized searching and (b) an extensive
alphabetic index which serves as a "lead-in" vocabulary or
"entry" vocabulary.

Since the index has no special features, it need not be discussed
in detail other than to stress that it is essential for both
users and indexers and that no classification is likely to be of
much use without it. The index serves to lead "naive" users
from a simple term in the alphabetic index to a sophisticated
concept surrounded by related concepts in the hierarchical
classification. In this way, the logical thought processes and
relationships of categories built into the classification by subject
experts are used indirectly in every retrieval operation, thereby
upgrading the operation, no matter how inexperienced or how
lacking in understanding of the subject the searcher may be.
This results in a considerable upgrading of the content and
usefulness of the retrieved data in most searches.
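
One way to picture such an entry vocabulary is as a lookup table
from everyday terms to category notations. The categories,
notations, index terms and the function name `look_up` in this
sketch are all hypothetical; a real index would be far larger and
would be compiled by subject experts.

```python
# Hypothetical fragment of a classification: notation -> category description.
CATEGORIES = {
    "63.": "Agriculture",
    "63.7": "Dairy products",
    "63.73": "Cheese",
    "63.732": "Goats' milk cheese",
}

# Hypothetical alphabetic index: entry term -> notation.
# Synonyms and spelling variants all lead to the same category.
ENTRY_VOCABULARY = {
    "dairy products": "63.7",
    "cheese": "63.73",
    "cheeses": "63.73",
    "goat cheese": "63.732",
    "chevre": "63.732",
}


def look_up(term):
    """Return (notation, description) for an entry term, or None if unknown."""
    key = " ".join(term.lower().split())  # normalize case and spacing
    notation = ENTRY_VOCABULARY.get(key)
    if notation is None:
        return None
    return notation, CATEGORIES[notation]


found = look_up("  Goat   Cheese ")
```

Once the searcher has the notation, the neighbouring categories
in the hierarchy are available without any further vocabulary
knowledge.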

The word notation (sometimes called class numbers, which
incorrectly implies that only numbers are used in the notation)
refers to a string of characters used to uniquely identify each
category in a classification. The notation makes it possible to
use an unlimited number of words, phrases, synonyms,
near-synonyms and variations in spelling or plurality precisely to
describe the conceptual content of each category in the
classification.

For the type of modern hierarchical classification I have advo- 
cated in previous paragraphs, the notation must also be hierar- 
chical in order to reflect the structure and organization of the 
hierarchy. In other words, the notation for all subdivisions of a 
major category must be a meaningful, expressive notation 
rather than a meaningless string of characters. For example, 
notations 51.83, 51.832, 51.8345 and 51.83FT4 identify four
categories which are all subdivisions of category 51.83, which
in turn is a subdivision of category 51.8, which is a subdivision
of the major category 51., etc. It is highly undesirable to use
periods to set off each new number added to the notation 
(compare 51.83FT4 with 51.8.3.FT.4) since this adds extra and 
unnecessary characters and is more difficult to manipulate both 
manually and by the computer. 

The advantage of this hierarchical notation is that the indexed 
items assigned to very specific categories (for example, 51.8345
or 51.83FT4) are clearly linked by the notation to every category
that is more generic (51., 51.8, and 51.83). In this way, every
character in the notation has meaning and reflects both subject
content and precise relationships between categories in a very
compact, concise way.

This type of notation permits users to select a notation at any
desired generic level and let the computer identify and retrieve
all subcategories of that category using only the notation as an
instruction. No other automatic position, generic-to-specific
posting referral process or mapping procedure is necessary for
the automatic retrieval of all subcategories subsumed under the 
major category specified by the user. 
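
With a hierarchical notation of this kind, both directions of the
hierarchy reduce to simple prefix tests on character strings. The
sketch below uses the notations quoted in the text; the extra
entries 51.9 and 52., the document postings and the function
names are hypothetical additions for illustration.

```python
# Notations from the text, plus two hypothetical neighbours (51.9 and 52.).
NOTATIONS = ["51.", "51.8", "51.83", "51.832", "51.8345", "51.83FT4",
             "51.9", "52."]

# Hypothetical postings: each document is indexed at its most specific category.
POSTINGS = {"a": "51.8345", "b": "51.83FT4", "c": "51.9", "d": "52."}


def subcategories(notation):
    """All listed categories subsumed under the given notation."""
    return [n for n in NOTATIONS if n.startswith(notation) and n != notation]


def generic_chain(notation):
    """All listed categories more generic than the given one, broadest first."""
    chain = [n for n in NOTATIONS if notation.startswith(n) and n != notation]
    return sorted(chain, key=len)


def retrieve(notation):
    """Documents posted to the category or to any of its subcategories."""
    return sorted(d for d, n in POSTINGS.items() if n.startswith(notation))
```

For example, `retrieve("51.83")` returns documents a and b without
any table that maps specific categories to generic ones; the
notation itself carries that information.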

Although the notation in a modern classification can contain
letters or any other characters acceptable to the computer, it is
best to use numbers, since even long strings of numbers (e.g.,
the 11-digit numbers required for a long distance telephone
call) are relatively easy to memorize and manipulate for the few 
seconds needed to assign them to a document or enter them 
into a computer system. In fact, the first few numbers repre- 
senting major categories can usually be recalled without any 
look-up process by those who use the classification regularly. 

In contrast, strings of nonsense letters (mostly consonants) are 
very difficult to memorize and manipulate. The mnemonic ap-
proach sometimes used to build notations often results in a
clumsy, complex, lengthy string of characters that is much
more likely to introduce errors than simple strings of numbers.

A final comment on the notations is that they must be open- 
ended. Space must be left between major categories or groups 
of categories for the addition of new categories at a later time. 
Even more important, there must be no limit on the length of 
the notation. Although the classification should be organized in 
such a way as to keep the notation as short as possible, it must 
always be possible to add additional characters to the right of 
any notation to reflect new subdivisions. Any classification 
which sets a limit on the length of the notation has unnecessar-
ily restricted growth and evolution of the classification, thereby
creating increasingly difficult problems in keeping the classifi- 
cation up-to-date. 

Procedure for Building

The desirability of developing new modern classifications rather 
than trying to modernize existing out-of-date classifications was 
stressed earlier. It therefore seems appropriate to suggest some
useful guidelines for the building of such new classifications. 
It is usually easy to outline the major categories for a given sub-
ject area. I would urge readers to select a topic of interest and
see just how easy it is to develop a logical outline of major 
categories in just a few minutes. This is the only stage at which 
a tentative structure is based on preconceived ideas. After this 
point, the classification is developed and extensively modified 
only by creating new categories and arranging them in the best 
order for precise indexing of concepts obtained from input 
documents. 

An extremely valuable procedure for organizing major 
categories in a subject area is to construct a two-dimensional 
matrix with different aspects or facets on each axis. For exam- 
ple, the field of radiation biology is best represented by a ma- 
trix of types of radiation vs. types of organisms, organs, cells 
and molecules being irradiated. This rapidly divides the field 
into many major categories that can then be subdivided as the 
need arises. The field of biochemistry logically falls into a ma-
trix with major classes of compounds (proteins, amino acids,
lipids, carbohydrates, nucleic acids) on one axis and major
analytical subdivisions (synthesis, chemical properties, physical
properties, uptake and transport by the body, etc.) along the
other axis. Much of biomedicine falls into a matrix with major
disciplines (pathology, physiology, pharmacology, toxicology,
clinical medicine) along one axis, and organ systems (lung,
liver, stomach, skin, bone, etc.) along the other axis. Similar
matrices can be constructed to cover large areas of information
in most subjects.

Each intersection of the two axes becomes a unique, distinct 
category in the classification, although sometimes a whole row 
or column from the matrix is used as a category. The notation 
should be synthesized from numbers that show how the major 
category was synthesized. For example, major disciplines can 
be assigned notations as follows: 52. for pathology, 53. for
physiology, and 54. for pharmacology, etc. Organ systems can
be assigned as follows: 43 for kidney, 52 for lung, 83 for skin,
etc. Intersection of these two sets of numbers gives 52.43 for all
kidney pathology, 53.52 for physiology of the lung, and 54.83 
for pharmacologic agents that act on the skin. When combining 
two sets of numbers, the numbers representing the most 
open-ended and detailed aspect must always be placed to the 
right of numbers representing the broader aspects. 
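
The synthesis of notations from the two axes of such a matrix can be sketched in a few lines of Python. The dictionaries reuse the numbers given above; the function and variable names are hypothetical, and the sketch assumes that the discipline is always the broader aspect.

```python
# Axis 1: major disciplines (the broader aspect, placed to the left).
disciplines = {"pathology": "52.", "physiology": "53.", "pharmacology": "54."}

# Axis 2: organ systems (the more open-ended aspect, placed to the right).
organ_systems = {"kidney": "43", "lung": "52", "skin": "83"}


def synthesize(discipline, organ):
    """Build the notation for one intersection of the matrix."""
    return disciplines[discipline] + organ_systems[organ]


# Every intersection of the two axes becomes a distinct category.
matrix = {
    (d, o): synthesize(d, o)
    for d in disciplines
    for o in organ_systems
}

for (d, o), notation in sorted(matrix.items(), key=lambda item: item[1]):
    print(notation, "=", o, d)
```

Because no period separates the two parts after the major category, the synthesized notation remains a simple prefix of any later subdivision.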

The important point in giving these examples is that use of
matrices is the best way to build precoordination into the 
categories as the classification is developed. It is quite different 
from precoordinating or postcoordinating individual categories
after the classification is developed. 

The next step after the initial set of major categories is created
is to identify representative documents that fall in the selected
area and assign concepts from those documents to categories 
in the classification. During this phase, it is necessary to add 
many new categories and subdivisions of existing categories to 
the classification— often at the rate of several new categories 
per document. In addition, existing categories must be shifted 
to new locations, deleted, or completely revised by using new 
words to reflect increased or decreased scope. 

The structure of the classification must be extremely fluid and 
flexible at this stage. It should not be used for final indexing of 
any document until many hundreds of documents representing 
the whole subject area covered by the classification have been 
used to improve and flesh out categories in the classification.

During this developmental period, the classification should 
match the conceptual organization, the way of thinking and the 
conceptual framework presented by authors of the documents 
and by representative users. Clearly this requires extensive 
input from subject experts and review by those who are actively 
working in the subject area. 

The emphasis in this process of developing a new modern clas- 
sification is on a very practical, pragmatic, empirical approach 
with as few rules or constraints on the structure or organization
of the classification as possible. Whatever wording or organiza-
tion of categories works best and seems most useful should be
used. Any attempt to use vague, general or artificial concepts
(e.g., personality, matter, energy, space, time, etc.) to organize
or to create categories is of no value in developing a specific 
modern classification that accurately corresponds to the struc- 
ture and organization of the subject area it will be used to 
index. 

Even after the early pilot phase is completed and use of the 
classification for routine indexing begins, the same flexibility in
revising the categories as soon as the need arises is required if
the classification is to remain viable. The best person to make
these changes is the individual who is indexing documents or 
formulating a search and sees the need to incorporate a new 
concept into the classification. This should be done im- 
mediately, the first time such a need is identified. The change 
must be made quickly and easily. It cannot wait for a decision 
by a committee. 

An indexer or searcher should consult with others if there is 
any question about where to place the new concept in the 
hierarchy. However, if this is not possible, the concept should 
still be entered into the classification at once. Subsequent re- 
view of changes may shift the category at a later time, but in 
the meanwhile it is available for use the next time the same
concept is encountered. The idea of postponing a needed
change until review by some elite "authority" is completely un-
acceptable since the indexer or searcher is most likely to know
how the new concept is described and related to the rest of the 
subject area by the author of the document or the individual 
requesting a search. 

The process just described is designed to keep the classifica- 
tion modern and up-to-date. The focus must be on making the 
classification even more useful a year and ten years from now,
rather than on whether the indexing of a few documents today
or in the past becomes invalid because of a change in the clas-
sification.

In this connection, it is worthwhile mentioning that the type of 
modern classification advocated here has very little need for 
any manual reindexing of older documents when the classifica-
tion is changed. Inserting a new concept or broadening the
scope of an existing category has minimal effect on past index- 
ing. Shifting a category to a new location with a new number 
can be followed by a corresponding change made automatically 
in the computer files. Subdividing a category means that 
documents indexed under any more general category can still 
be retrieved if the requester is willing to accept documents that 
could only be indexed at the more generic level (either because 
the category had not yet been subdivided or because the 
document covered so many aspects of a major category that it 
would have taken too much time to post it to all the subdivi-
sions of that category). The inclusion of more generic
categories reduces the precision of the retrieval, but is an ex-
cellent recall device which should be used for all searches un-
less the user specifies otherwise.

Three additional comments on building a classification may be 
useful. First, the categories should be organized and selected
in such a way as to keep the notation as short as possible.
Second, it is highly desirable to divide each category into only
five or six major subcategories so that only one number or let-
ter needs to be added to the notation for the major category. If
this is impossible, then it is perfectly permissible to have up to
99 subdivisions and use two digits after the notation of the
major category to represent each subdivision. Third, long lists
of specific items (compounds, chemical elements, names of or-
ganisms, etc.) that need to be itemized under a major category
are best arranged in alphabetic order with the first few charac- 
ters of each word, followed by one or two numeric digits incor- 
porated into the notation. In this way, the order of the notation 
reflects the alphabetical order in long lists of terms that all fall 
in the same class (antibiotics, bacteria, countries, names of in-
dividuals, etc.). Such alphabetic lists, imbedded in the classifi-
cation, are much easier for both indexer and searcher to use
than groupings of items into artificial subclasses. 
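
One possible way to build such alphabetic notations is sketched below in Python. The choice of a two-letter prefix, the use of evenly spaced two-digit numbers (to leave room for later insertions) and all names are assumptions made for illustration; the text above does not prescribe these details.

```python
from collections import defaultdict


def alphabetic_notations(parent, terms, prefix_len=2):
    """Assign notations whose order follows the alphabetic order of terms."""
    groups = defaultdict(list)
    for term in sorted(terms, key=str.lower):
        groups[term[:prefix_len].upper()].append(term)

    notations = {}
    for prefix, members in groups.items():
        if len(members) > 99:
            raise ValueError("more than 99 terms share the prefix " + prefix)
        # Spread the numbers out so that new terms can be inserted later.
        step = max(1, 99 // (len(members) + 1))
        for position, term in enumerate(members, start=1):
            notations[term] = f"{parent}{prefix}{position * step:02d}"
    return notations


antibiotics = ["streptomycin", "penicillin", "tetracycline", "penicillamine"]
notations = alphabetic_notations("54.83", antibiotics)
for term in sorted(notations, key=notations.get):
    print(notations[term], term)
```

Sorting by notation and sorting by name give the same sequence, which is the property the alphabetic list is meant to preserve.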

In closing this section, I might mention that a computer system
named AUTOCLASS has been designed for automated creation
and updating of both the classification schedule and the al-
phabetic index that accompanies it.^ Changes in categories re-
sult in corresponding changes in the alphabetic index, in
cross-references to the changed categories and in cross-
reference statements within the changed category, since all
these linkages are recorded and used by the computer during
each update step. Lists of changes made during the updating
of categories and index terms or cross-references that need to
be checked as a result of the changes are printed out during
each update cycle. The existence of this type of automated sys-
tem makes it much easier to build and update a classification
than was previously possible.

Use in an Automated Environment

No matter what system is used for indexing and retrieval, it can 
probably be improved by using a modern classification in com-
bination with the existing system. A good example of a hy-
brid system would be the combination of free-text searching*
(to retrieve on any specific term in titles and/or text) plus 
categories from a classification (to permit retrieval of small 
groups of specific documents without having to specify every 
term needed to identify those groups). 

Modern classifications used to supplement existing systems 
can be very simple, consisting of only a few dozen categories 
on a list that is checked off for each document entered. Or they 
can be much more extensive, approaching the type of deep 
multilevel hierarchy advocated in earlier sections. 

Another use of modern classifications is for automatic mapping 
of words in free-text search systems. If all searchable words or 
terms are arranged in a deep multilevel hierarchy or word tree,
it is possible to use this classification as the front end to the
search system. When the user enters a word, the computer can
use the classification to identify all the terms and words sub-
sumed under the selected word and include them in the search.

This use of the computer to "expand" or "explode" or "map" a
term can either be built into the computer as an automatic fea- 
ture, or it can be an option which the user must ask for. Alter- 
natively, the computer can display all the subsumed terms or 
categories that are narrower than the entered term (along with 
all the near-synonyms and related terms) and let the user 
choose those index terms or categories he wants to include in 
the search. 
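
The "explode" operation can be sketched in Python as a walk down the word tree. The tree, the term names and the function name are hypothetical sample data, not taken from any existing search system.

```python
# Hypothetical word tree: each term lists its immediately narrower terms.
narrower = {
    "antibiotics": ["penicillins", "tetracyclines"],
    "penicillins": ["ampicillin", "penicillin G"],
    "tetracyclines": ["doxycycline"],
}


def explode(term, narrower):
    """Return the entered term followed by every term subsumed under it."""
    expanded = []
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against a term that is listed twice
        seen.add(current)
        expanded.append(current)
        # Push in reverse so that narrower terms come out in listed order.
        stack.extend(reversed(narrower.get(current, [])))
    return expanded


print(explode("antibiotics", narrower))
```

The same list can either be added to the search automatically or displayed so that the user can choose among the subsumed terms.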

Modern classifications are also useful for computer-aided in-
dexing at the time of input. The indexer enters a word or term
or category into the system, and the computer displays the
categories from the classification (or all the narrower terms
from a classified thesaurus) that are equivalent, narrower than,
or related to the entered term. This permits the indexer to 
select additional terms or categories from the displayed infor- 
mation by touching them with a light pen or some other 
touch-sensitive device. 

To go even one step further, modern classifications are very 
useful for sophisticated automatic indexing or, more precisely,
for automatic assignment of a document to classes or 
categories. If the alphabetic term list which is required as an 
entry vocabulary for the classification is very extensive, then 
every significant term present in the information being indexed
can be looked up in that term list. The category numbers as- 
signed to each term automatically place the term in its most 
logical location in the hierarchy of the classification. If a multi- 
meaning term has several category numbers, then selection of 
the correct category is based on clues supplied by other 
categories already assigned to the document, particularly those 
categories identified by terms in sentences or paragraphs adja- 
cent to the multimeaning term. This type of automatic classifi- 
cation or grouping into categories is based on content analysis 
that is supplied by the classification rather than on purely
mathematical clustering algorithms. 
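
A much simplified Python sketch of this kind of automatic assignment follows. The term list, the category numbers and the rule for choosing among several meanings (prefer the candidate that shares the longest notation prefix with categories already assigned) are assumptions made for illustration. The sketch also uses the whole document as context, whereas the procedure described above gives particular weight to adjacent sentences or paragraphs.

```python
import re
from os.path import commonprefix

# Hypothetical entry vocabulary: term -> one or more category numbers.
term_list = {
    "lung": ["53.52"],
    "respiration": ["53.521"],
    "battery": ["71.4"],
    "cell": ["53.9", "71.4"],  # a multimeaning term
}


def closeness(candidate, assigned):
    """Length of the longest notation prefix shared with any assigned category."""
    return max((len(commonprefix([candidate, a])) for a in assigned), default=0)


def assign_categories(text, term_list):
    """Assign categories to a text, resolving multimeaning terms by context."""
    words = re.findall(r"[a-z]+", text.lower())
    found = [w for w in words if w in term_list]

    assigned = []
    # First pass: terms with a single category supply the context.
    for word in found:
        if len(term_list[word]) == 1 and term_list[word][0] not in assigned:
            assigned.append(term_list[word][0])
    # Second pass: choose the meaning closest to the categories so far.
    for word in found:
        candidates = term_list[word]
        if len(candidates) > 1:
            best = max(candidates, key=lambda c: closeness(c, assigned))
            if best not in assigned:
                assigned.append(best)
    return assigned


print(assign_categories("Respiration in the lung depends on cell membranes.", term_list))
print(assign_categories("The battery cell delivers voltage.", term_list))
```

In the first text the context categories all begin with 53., so "cell" is placed at 53.9; in the second the only context is 71.4, so "cell" resolves to that category.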

Modern classifications are also an excellent mechanism for 
facilitating the exchange of information between two or more 
information systems, including systems located in many differ- 
ent countries. A modern classification can be developed to in- 
clude a category for every concept (expressed by a variety of 
words, terms and phrases) in each of the independent systems. 
This new "central" or "common" classification forms the link
between each of the other systems. The type of deep, multi-
level, open-ended hierarchical modern classification advocated
in this paper is extremely useful for this modern application of
classifications. 

SDI systems which depend on precise matching of users with
documents are still another area where modern classifications
are of particular value, since categories can be subdivided to
identify the specific interests of each user. The special uses of
modern classifications described in the last few paragraphs do
not in any way detract from the value of modern classifications 
to improve the retrieval performance of other types of informa- 
tion systems, as described in other sections. 

Problems

Previous sections have stressed the many advantages and use- 
ful applications of modern classifications. Before concluding 
this paper, some of the disadvantages and problems must also 
be mentioned. These arise mostly from the need for subject ex- 
perts with extensive background and experience who will un- 
dertake the development and oversee the continuous updating 
of the classification and its associated alphabetic index. The 
decisions that these experts must make regarding the words
used to describe each category, the best subdivisions of each 
category and the most logical arrangement of categories are 
critical to the successful use of a modern classification. As a 
result of this need for expertise, the development and updating 
of a modern classification is more expensive and time-con- 
suming than the creation and maintenance of other types of 
indexing tools. 

However, the difference in time and cost may be more than off-
set by the fact that information indexed by the classification is
better organized and analyzed and easier to retrieve by knowl-
edgeable users than information indexed by most other
methods. Since the number of users is usually many orders of
magnitude larger than the number of experts who develop the
classification, and the total amount of retrieval time at multiple
scattered locations is several orders of magnitude larger than
the time spent on indexing, it is worth the extra effort to build
and use the best possible indexing tool if it saves time and
effort on the part of the users.

The size and complexity of enumerative hierarchical classifica- 
tions with many See Also linkages between categories and links 
between the classification schedule and an alphabetic index to 
the categories present a major maintenance problem. In the
past this complexity has discouraged revision, and classifica-
tions have gradually become obsolete for lack of updating. De-
velopment of modern automated systems for easier revision
(such as the AUTOCLASS system mentioned previously) should
significantly alleviate this problem. It must also be stressed that
all indexing tools suffer if they are not continually updated and
that this problem is not unique to classifications.

Perhaps the biggest problem of all is whether any indexing tool 
is cost-effective. Free-text searching of any word in a title or
abstract makes it possible to retrieve much useful information
without any human indexing process. The extent to which this 
retrieval can be improved by using a modern classification or 
any other indexing tool, and whether this improvement is worth 
the added cost of the indexing, are questions that can only be 
answered by experiments which test retrieval performance of 
the various systems under carefully controlled conditions. Only 
a few results of these tests, such as the comparison of free-text 
indexing with the use of a modern classification mentioned ear-
lier (see notes 2 and 3), have been published. They urgently
need to be confirmed and extended by other researchers. 



Conclusions 

This paper has identified some attributes of a modern classifi-
cation, and discussed why modern classifications are of value,
how they should be structured and organized, how they lead to
a useful balance between precoordination and postcoordina-
tion, the type of notation and index needed for a modern clas-
sification, some guidelines for building a modern classification
and some useful applications and disadvantages of using mod-
ern classifications. Stress has been placed on the use of
enumerative multilevel hierarchical classifications and their ad-
vantages. It is hoped that readers will be stimulated by some of
the ideas presented here to try to build such modern classifica-
tions for indexing information in various subject areas and to
see for themselves how useful such classifications can be for 
achieving higher recall and higher precision than is possible 
with other types of indexing tools. 


The Dewey Decimal and Library of Congress Classifications:
an Overview^

Maurice F. Tauber and Hilda Feinberg 



In the United States, two classifications are used primarily for 
the organization of materials in libraries, the Dewey Decimal 
Classification (DDC) and the Library of Congress Classification 
(LC). Each has inherent advantages and disadvantages for dif- 
ferent types of libraries. Libraries have, in general, made their 
systems fit the needs of readers. However, as a rule, the closer
the classification follows the order of the classification of 
knowledge, the more fully it serves the purpose of grouping to- 
gether the books and ideas which are related. The basis of lib- 
rary classification by subject is the assumption that books on 
the same or related subjects will frequently be used together.^
The classificationist attempts to develop a scheme which will 
arrange books on the shelves in an order that will be recog- 
nizable as following some definite plan, will be in harmony with 
current studies, and will enable the finding of books together 
which have some likeness in a greater or less degree.^

Sayers has outlined the essentials of a library classification. 
What makes the value of one system as compared with another 
is its generalness of character, its order, the logical process of 
its subdivision, the quality of its terminology, and (at a later
stage) its practicality as shown in its notation and indexing.^
The major disciplines should be represented, and they should
be given space relative to their size; there should be flexibility
to allow for extension of developing disciplines, reduction of
contracting disciplines, and movement of disciplines or parts of
disciplines from one section of the classification to another to
express changing relationships.^



Dewey Decimal Classification 

The most widely used scheme, and the oldest, is the Dewey 
Decimal Classification (DDC). It was devised in 1873 by Melvil 
Dewey for the Amherst College Library. First published in 1876,
the arrangement of the classes was based to some extent on
the classification scheme devised by W. T. Harris for the St.
Louis Public School Library in 1870, which in turn was derived
from Bacon's Chart of Learning.

As described by Mills, the importance of DDC lay in two sig-
nificant advances it made over previous systems: 1) a notation
was devised which exhibited great simplicity and flexibility,
permitting a flexible shelf arrangement; 2) the comprehensive
Relative Index, which showed those relative aspects of a sub-
ject which the systematic order scattered, solved to some ex-
tent what until then had been considered a serious drawback to
systematic order.^ The principle of relative location of books on
the shelves was introduced, whereby the order of the books fol- 
lowed that of the classification scheme. This replaced the pre- 
viously employed fixed location system of classifying books in 
libraries in which books were arranged according to size, ac-
cession number, or other considerations. Dewey's relative loca-
tion meant that new titles could be inserted in their proper
places alongside similar works already on the shelves without
having to change the existing location symbols. This permitted
continual moving of books from one shelf to another without
destroying the logical order. The place of each book on the
shelf was always the same in relation to the books on either
side of it, although its actual position varied as books were
added to the shelves, moved or withdrawn from the collection.

Since the basic arrangement of DDC is systematic by conven- 
tional disciplines (history, literature, chemistry, etc.), and any 
given subject may be dealt with from various aspects and be 
classified in more than one discipline, it is the purpose of the
Relative Index to indicate all significant relationships between
topics and show the relation of these topics to those in other 
areas, as well as their dispersion throughout the schedules.^ 



From 1876 to 1942, fourteen editions of DDC appeared. Since 
1894 an abridged edition has been issued for small libraries
and school libraries. At present, the eighteenth edition of the 
unabridged edition and the tenth edition of the abridged DDC 
are available. The abridged edition is useful for school and pub- 
lic libraries that do not predict growth larger than 20,000 titles. 

The Dewey Classification may be described as an enumerative 
classification with provision of synthetic devices in some areas.
As noted by Needham: 

Even schemes which are predominantly enumerative usually 
provide synthetic devices to cater for common form-division,
space and time elements— for clearly all of these would 
otherwise have to be enumerated at more or less every divi-
sion in the schedules. Any attempt to enumerate complex
subjects is in practice found to be selective, it could never 
hope to encompass the unpredictable multiple relationships 
found in literature. Additionally— though this need not 
necessarily follow— it is likely that the enumeration will be 
unsystematic.^

Needham concludes that enumerative schemes are likely to 
have the following limitations in the schedules:^ 

Omission of some simple and complex subjects, duplication 
of others. 

Conflicting principles underlying the placing of complex 
subjects. 

As an example of the latter, The harvesting of potatoes may be 
found under Potatoes, The harvesting of wheat under 
Harvesting. Materials on the same subject may be found in two 
or more places. 

Dewey incorporates numerous synthetic devices as may be rep- 
resented by standard subdivisions, area tables, tables providing 
for subdivision of individual literatures, languages and other 
provisions. Recent editions indicate increasing use of synthetic 
elements, offering broader hospitality to complex subjects. DDC 
is becoming fuller in coverage and more capable of displaying 
complex topics.^ 

The notation symbols of DDC and other classifications are not
necessarily connected logically with the principles upon which
the formation and arrangement of the classes and their sub-
divisions are based, as they are added subsequently to the cre-
ation of the classification. The notation symbols are used to
identify and shelve the books. The DDC notation is expansible;
a new number may be created by the addition of another digit.
The length of the notation to be used is determined by the indi-
vidual library, taking into account its size, character and proba-
ble rate of growth. The small general library should find a brief
notation of three to five digits satisfactory. The problem related
to the complexity of long numbers resulting from attempts to
gain greater specificity in close classification has resulted in a
policy of segmentation of the notations to make them adaptable
for libraries of varying size; for example, 285'.2'0922 may be
segmented as 285, as 285.2, or may be used in its entirety as
285.20922. Since 1967, DDC numbers on Library of Congress
cards have appeared in from one to three segments. The prime 
marks, which are not considered part of the notation, identify 
the varying levels at which notation is meaningful. Such seg- 
mentation makes it possible for a library to cut excessively long 
notations to more acceptable shorter numbers. 
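
The way prime marks define the permissible cutting points can be shown with a brief Python sketch. The function name is hypothetical, and the sketch assumes only what is stated above: the prime marks are not part of the notation, and each one marks a level at which the number may be cut.

```python
def segment_options(marked):
    """List the usable forms of a DDC number segmented with prime marks."""
    options = []
    current = ""
    for part in marked.split("'"):
        current += part
        # A cut directly after the decimal point would leave a dangling
        # period, so it is dropped from the shortened form.
        options.append(current.rstrip("."))
    return options


print(segment_options("285'.2'0922"))
# prints ['285', '285.2', '285.20922']
```

A small library might adopt the first form, while a library practicing close classification would use the number in its entirety.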



Criticism of Dewey Decimal Classification 

Arrangement. In examining the arrangement of the classifica-
tion, consideration should be given both to the order of the
main classes and the order within the classes. No one type of
arrangement is followed throughout the scheme— a number of 
arrangements, both natural and artificial, are employed. Yet, a
logical process of division and subdivision of main classes is 
carried out in most instances. Dewey employs an arbitrary ar- 
rangement in some cases where an alphabetical arrangement 
would probably be more desirable from the point of view of the
user; for example, in class 546, an alphabetical arrangement for
chemical elements would be preferred by the chemist. Alpha-
betic arrangements are provided in a few places in the
betic arrangements are provided in a few places in the 
schedules and auxiliary tables. 

Another example of inconvenient arbitrary arrangement is the 
schedules for music, 780. The order of the numbers does not
correspond to the importance of the subjects. The order of its
classes has been criticized as separating the social sciences
(300) from history (900), language (400) from literature (800),
and a particular science from its technology.
For example, the separation of chemistry from chemical tech-
nology invites criticism from some chemists using the classifi-
cation.

The order of the main classes is not of crucial practical signifi-
cance, particularly in larger libraries. In fact, the editor of the
eighteenth edition stated: 

The primary basis for DDC arrangement and development of
subjects is by discipline, as defined by the main and subor-
dinate classes, while subject, strictly speaking, is secondary.
There is no one place for any subject in itself; a subject may

by discipline.^

Of greater concern than the order of main classes is the rigidity
resulting from a strict division by tens.^ Accordingly, many
classes and subclasses are overcrowded, notably where the
scheme fails to provide sufficiently for the interests and re-
quirements of foreign, scientific or technical libraries. In recent
editions, the editors have attempted to reduce elements
of bias and to deemphasize the Western bias of the schedules.
As an example, the non-Christian faiths are developed in more
detail in the 200 classification. An additional provision recom-
mends an option of using a letter or other symbol as an artifi-
cial digit to bring into prominence specific linguistic, ethnic or
cultural approaches.

Other Criticisms. Criticism has been expressed about a lack of
foresight in relation to the growth and change in technical and
scientific areas. Each edition attempts to update the scheme to
keep pace with expanding knowledge through expansion of ex-
isting numbers, by the addition of more subdivisions and
through relocations of topics in the schedules, all
within the framework of the official policy to
preserve the integrity of numbers, which means that vacated
numbers cannot be used for new topics, at least until a time
when such a change would be of relative insignificance to most
users of the scheme. Dewey realized that a classification which
changed to a substantial extent with each new edition would
not be acceptable to librarians. Changes in each edition force
the librarian to consider reclassification, requiring alteration of
notations on catalog entries, reshelving, and refiling, entailing
additional time and expense.

A number of other objections to DDC may be cited. No Cutter-
ing is supplied for DDC classification numbers on LC cards, a
deficiency which requires time-consuming and costly shelflist-
ing operations.^12 While DDC numbers have appeared on many
LC cards since 1930, the number of titles classified by Dewey 
numbers has varied markedly since the service began— from a 
high of 99 percent of all cards prepared in the fiscal year
1933/34, to a low of 24 percent in 1965/66.^13 The DDC number
on LC cards should be considered as only a suggested number 
which may no longer be valid if one is using a new edition of 
DDC. Libraries using DDC are obliged to perform expensive 
original classification for a substantial percentage of their titles. 

DDC has been criticized as being too permissive. "This is a 
boon to custom cataloging or to local cataloging preference, 
but a Pandora's Box in centralized cataloging. Examples are 
the classification of biography in 920 or the subject number,
bibliography in 016 or the subject number, and extension of 
class numbers or building numbers beyond what is given on 
the LC card.¹⁵

The advantages that have been attributed to both the Dewey 
Classification and the Library of Congress Classification have 
been exhaustively recounted in the literature. Among the advan- 
tages of DDC are its up-to-dateness with successive revisions 
and its mnemonic features. Its notation is simple and com-
prehensive, but the length of notation used in many libraries
presents a definite problem. It is adaptable for use in libraries
of varying size and kinds. However, the Classification Commit-
tee, RTSD Cataloging and Classification Section, in 1964 rec-
ommended Dewey for libraries with general collections up to
200,000 volumes in size, and the Library of Congress system for
those expected to be larger and for those small libraries with
specialized collections.¹⁶



Use of the Dewey Decimal Classification

In 1960 it was reported that some 95 percent of public libraries, 
nearly 90 percent of college and university libraries and over 60 
percent of special libraries in the United States used DDC. In 
Great Britain, over 500 libraries used it. It was reported at that 
time that it had been translated into some nine European lan-
guages, and into Chinese and Japanese.¹⁷

DDC has continued to be effective for most libraries for almost
a century. It may be found in some form throughout the world.
Of the sixteenth edition, 25 percent of the copies sold were to
libraries outside the USA.¹⁸ On the continent of Europe, the
Universal Decimal Classification (UDC), a derivative of DDC, is
used to a great extent, in spite of the fact that it uses long 
numbers and is subject to many changes. 

Since 1934 the Decimal Classification Section of LC has period- 
ically issued Notes and Decisions on the Application of the 
Decimal Classification. Since 1950, DDC has been used for the 
arrangement of the British National Bibliography.¹⁹ Both the
R. R. Bowker Company and the H. W. Wilson Company use the
DDC in their bibliographic publications. Dewey numbers may be
found in Publishers' Weekly, American Book Publishing Record,
Book Review Digest, the Standard Catalog Series, the ALA
Booklist, New Serial Titles, and in several national bibliog-
raphies. In addition, many commercial processing firms are
prepared to classify by Dewey.²⁰ The DDC numbers on LC
cards, on Wilson cards, and numbers derived from other pub-
lished sources should not be accepted without further checking
of local shelf lists and policies. Among other factors, one needs
to know from which edition the numbers have been assigned.

While many of the reasons for abandoning DDC lie within the 
classification itself, some of the contributing factors have been 
outlined by Maltby.²¹ Failure of libraries to accept changes in
succeeding editions of Dewey and tinkering with its numbers
serve to lessen its practical use; absence of DDC and book
numbers on purchased cards, as noted previously, increases the
cost of classification; concern over the lengthening notation of
DDC, routine recommendations to reclassify made by library sur-
veyors, and a sincere conviction that another classification is
better designed and more appropriate are all contributing
factors.



Library of Congress Classification 

Second in usage to the Dewey Decimal Classification in this 
country is the Library of Congress Classification (LC), an
enumerative scheme which was originally developed from an
incomplete expansive classification founded by Charles Ammi
Cutter. The LC classification was designed to be a pragmatic
and expansive scheme for the holdings of the Library of Con-
gress and what it might be expected to add in the future. The
individual subject schemes of the classification were indepen- 
dently created by a number of subject specialists for each dis- 
cipline, who worked individually and in groups, and have been 
published by the U.S. Government Printing Office since 1901. 
Each main class is published separately, and though developed 
separately, each represents a unified part of an overall scheme 
for the organization of library materials. Since very few libraries 
expect to grow to the size of the Library of Congress, the 
scheme represents adequately in most cases the needs of the 
majority of libraries in this country. Because of the extremely
large holdings of the Library of Congress, and its status as a
depository for copyright works, the schedules are generally
comprehensive and provide to a large extent for the scholarly
works likely to be held by academic, research and large public 
libraries. 

Expansion of the classification is governed by literary warrant, 
depending upon the acquisition of new materials by the Library
of Congress. Thus, the development of the classification is di-
rectly affected by the acquisitions policy of the library.

The original designers of the scheme provided for expansion by 
leaving gaps at places which were predicted to be appropriate 
in the future. The advancement of knowl-
edge is impossible to foresee, however, so the
placement of the gaps will of necessity be approximate. There
are five single-letter classes that have not been used, and many 
double-letter combinations available for future expansion. 

The individual schedules are kept current by 1) L.C.
Classification-Additions and Changes (quarterly) published by
the Library of Congress, 2) the addition of supplementary pages
of Additions and Changes to reprinted editions of individual
schedules, and 3) publication of new editions of the individual
classes when appropriate. The Gale Research Company offers a
compilation, L.C. Classification Schedules. Additions and
Changes through 1970, and Additions and Changes for
1971-72.²²



Criticism of the Library of Congress Classification 

Arrangement. The classification schedules have been built up
continuously as material requiring new subdivisions and revi-
sions in the existing schedules has been added to the Library
of Congress collection. While there is general uniformity of 
structure and format throughout the schedules, the classes, di- 
visions and subdivisions have been developed and revised to 
meet the needs and use made of the large collection. Conse- 
quently, no strict uniformity among individual schedules in re- 
gard to subdivision for form, geographic areas or periods is 
evident. Subjects are followed by subject subdivisions, progres- 
sing from general to specific as far as possible. The schedules 
frequently provide for an alphabetical order of subdivision for
subclassification employing topical Cutter numbers to represent 
individual topics, rather than classified subdivisions. 

Many classes are equipped with special tables and directions 
for subdividing the classes more minutely. These tables are 
peculiar to the one subject to which they apply and can seldom
be used to subdivide other topics; thus they exhibit minimal
mnemonic characteristics. While no facets are common to the
whole scheme, the separate classes are provided with some
synthetic devices to varying degrees. Classes H and P have
synthetic capabilities; class Q does not.²³

The order of the main classes, although somewhat arbitrary, is
based on the major traditional disciplines. The quality of detail
varies from one part of the scheme to another. The LC is com-
prehensive, but not universal. As might be expected, a general
classification scheme designed for a library identified as the
congressional library of the country places emphasis on politi-
cal and social sciences and on history. While providing for
these areas in depth, LC offers as well a comprehensive treat-
ment of language and literature. The Library of Congress makes
available through its printed cards, book catalogs and MARC
tapes, classification numbers for the major subjects likely to be
represented in general libraries of all sizes. For subjects like
law, medicine, science and technology, many libraries with ex-
tensive holdings have had to use special schemes. The Library
of Congress does not assume responsibility for comprehensive
collecting in such special fields, and thus cannot provide as de-
tailed a classification as might be needed by specialist libraries.
The National Library of Medicine has its own classification,²⁴
and this has been adopted by many other medical libraries.
Other Criticisms. As the result of a broad survey, the basic
satisfaction of librarians with LC has been confirmed.²⁵ Among
complaints were a lack of a general index to all of the
schedules, the fact that many parts of the schedules lack ade-
quately detailed instructions, the difficulty of keeping track of
changes in classification in the present format (a loose-leaf or
index-card method of publication is preferable; ideally, new re-
vised schedules incorporating the changes should be printed
more frequently), the failure to supply author numbers in the
literature schedules, since most academic libraries do not use
PZ 3 and PZ 4, and the lack of a manual of instruction for ap-
plying the scheme.

By far the most frequent reason given for not accepting LC 
without change was that the library did not use the PZ fiction 
class, but instead classified such titles in the various national 
literatures. As noted by Bead, perhaps no decision has pro-
duced more comment than the grouping of all fiction in English
in PZ 3 and PZ 4. This material includes not only American and
English fiction, but also foreign fiction translated into English.
As indicated by Bead, the original purpose of classing all fiction
in English in PZ 3 was no doubt to bring together at the Library
of Congress a special collection of fiction, arranged alphabeti-
cally by author, which a reader could easily use for browsing
without first consulting the catalog. 

Some progress in meeting some of the above objections has
been accomplished. Two general indexes to LC were an-
nounced in 1974: An Index to the Library of Congress Classifi-
cation, with Entries for Special Expansions in Medicine, Law,
Canadiana and Nonbook Materials, Canadian Library
Association,²⁷ and Combined Indexes to the Library of Con-
gress Classification Schedules, edited by Nancy B. Olson, U.S.
Historical Documents Institute.²⁸ As mentioned previously, Gale
offers L.C. Classification Schedules. Additions and Changes
through 1972. Considerable interest exists in creating a general
manual of instruction. While no manual has yet been issued by
the Library of Congress, The Use of the Library of Congress
Classification by Schimmelpfeng and Cook²⁹ and A Guide to
the Library of Congress Classification by Immroth³⁰ offer some
degree of assistance. In regard to fiction, Library of Congress
cards for fiction in English now include, in addition to the usual
LC number, another number for the nationality and period of
the author.



Use of the Library of Congress Classification

In whole or in part, the scheme is being used increasingly, par-
ticularly in academic libraries.³¹,³² Hoage in 1961 located 256
libraries using the LC system.³³ Richard Angell indicated that
between 800 and 1000 libraries were using LC in 1964.³⁴ He
predicted that in the ensuing eight years this number would
double. The growth has been marked, and is related not only to
libraries changing from another classification, but also to new
developing libraries and to departmental libraries of univer-
sities, particularly science and technology collections. In addi-
tion to academic libraries, there have been a number of public
libraries as well as state, historical and special libraries which
have adopted LC. In regard to the size of libraries using LC,
there is evidence that this is not a significant factor. Small as
well as large libraries find the classification appropriate for
their collections. Over the years, there has been a tendency to
regard LC as a system only for arranging materials for large re-
search libraries. In recent years it has become clear that the
system is suitable for all types of libraries, even, in some cases,
school libraries. Some foreign libraries, particularly those in-
volved with governmental responsibilities and general research,
have accepted LC as an effective arrangement of materials for
their use and services. There is no question about the ability of
adults to use books arranged by LC. There was some ap-
prehension about children and young people not being able to
locate materials classified by this system. This has been dis-
proven by the experiences of both the St. Paul and the Buffalo
public libraries.



Reclassification³⁵

The pressures of growth, expanding knowledge and publica- 
tion, and rising costs have made evident the inadequacies of 
outmoded classification systems. In the late nineteenth and 
early twentieth centuries, libraries generally changed from local 
systems to the Dewey Classification. Beginning in the 1920s the
trend in reclassification shifted toward the introduction of the
Library of Congress Classification as the advantages of that
system for large research libraries became apparent. In recent
years, a significant development has been the indication that 
LC is suitable for smaller libraries too. 

A survey conducted in 1966 revealed that 85 percent of those
libraries approached which had shifted to LC had formerly used
Dewey or Modified Dewey.³⁶ The survey indicated that 59
percent preferred to reclassify the entire collection; 41 percent
did not. Usually the overriding reason for adopting partial re-
classification is economic: recataloging projects are expensive.
Although a librarian may be cognizant of the difficulties and
costs involved in programs of reclassification, he may institute
the project on the basis of two assumptions: 1) that the use of
a classification such as that of the Library of Congress (for
most changes have been directed to LC) achieves a grouping of
the books in the collection in terms of greater educational sig-
nificance and shows the users the currently accepted relation-
ships among the branches of knowledge more effectively than
did the system that is being replaced, and 2) that the adoption
of a new classification, which involves abandoning a system 
that has been found expensive to handle technically, will in the 
long run be an efficient administrative device. These assump- 
tions are based on the testimonies of librarians who have grap- 
pled with the problem and on the results of general surveys. 

The reasons for reclassification include economic considera-
tions, problems relating to the system in use, the desire to im-
prove services for the user, and reasons relating to administra-
tive factors. DDC is the most abandoned library classification,
and LC the most frequently adopted.³⁷

In the Report of the Classification Committee, RTSD Cataloging
and Classification Section, 1964, on the type of classification
available to new academic libraries, it was stated that

LC has the advantage of not being logical in exposition, as a 
rule, and while it is practically impossible to memorize, it is 
easy to expand without upsetting existing classified books.
The advantage of a non-logical classification is apparent in
dealing with rapidly advancing subjects, as the sciences,
where a major change in thought can throw out a whole
branch in a previous arrangement of knowledge. LC can in-
terpolate where DC must compromise.

Dewey has to be expanded through further breakdown,
sub-classification or re-naming and reassigning classes. LC
can be expanded by interpolation because the whole system
does not have to be logical but can, to a considerable de-
gree, grow like Topsy without regard to its environment. It
has been possible to abridge Dewey, but not LC.³⁸

In regard to notation the report added that the mixed notation
of LC is more complex than the pure notation of DDC. How-
ever, the LC numbers on the average are shorter than DDC
numbers. Dewey's notation is positional: each position repre-
sents a classification level. LC notation is ordinal: each class
has a number of its own, not necessarily related to preceding
or following classes. LC is much broader and more com-
prehensive than DDC.³⁹
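
The contrast between positional and ordinal notation can be sketched in a few lines of Python. This is only an illustration and forms no part of the committee's report: the function name dewey_levels and the sample number 546.3 are assumptions chosen for the example. The sketch shows that every leading digit of a Dewey-style number names a broader class, whereas the digits of an LC number carry no comparable chain.

```python
def dewey_levels(number: str) -> list[str]:
    """Return the chain of broader classes implied by a Dewey-style number.

    Hypothetical helper for illustration only: each leading digit is treated
    as one level of the hierarchy, which is what "positional" notation means.
    """
    digits = number.replace(".", "")
    levels: list[str] = []
    for i in range(1, len(digits) + 1):
        prefix = digits[:i]
        if i <= 3:
            # Pad to the customary three digits: "5" -> "500", "54" -> "540".
            label = prefix.ljust(3, "0")
        else:
            # Re-insert the decimal point after the third digit.
            label = prefix[:3] + "." + prefix[3:]
        # Skip repeats that arise when a number ends in zeros (e.g. "540").
        if not levels or levels[-1] != label:
            levels.append(label)
    return levels


if __name__ == "__main__":
    print(dewey_levels("546.3"))
```

No such function can be written for an ordinal notation: in LC the number following the class letters only fixes a place in sequence, so the broader class of a given number has to be looked up in the schedule itself.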

The Library of Congress Classification is supported by the sub-
stantial resources of the world's largest library operation. Any
reasonably comprehensive classification system developed and 
maintained by the considerable means of a federally supported 
agency, that is, the Library of Congress, is the logical classifica- 
tion system for general library use. The Library of Congress 
card service is backed by some of the best trained profession-
als to be obtained. Through its MARC program, the National
Program for Acquisitions and Cataloging, its Cataloging-in-Pub- 
lication Program, its book catalogs and its cards, it represents a 
true cooperative and centralized operation. The program 
of centralized or shared cataloging on an international basis 
brings to the library a greatly increased inflow of material which 
will, in effect, increase cataloging production and will, in turn,
be responsible for a substantial increase in the establishment of 
new numbers.^^ The principal advantages of using LC are 
economy in cataloging, speed in processing and the benefits to
be realized from tying into a large centralized cataloging 
operation.⁴²

The advantage of conversion to LC lies primarily in accepting
the classification numbers as they appear on the cards; other-
wise the economy is not fully realized. Unnecessary checking
and verification of data on the cards should not be performed
except in cases of obvious errors. Changes in LC call numbers
and other variations result in a situation where the library can-
not take advantage of the Library of Congress services. A large
number of libraries do not, or are not able to, take full advan-
tage of centralized cataloging. We cannot expect the program
of cooperation and centralized cataloging and classification to 
be any more than empty words unless catalogers stop thinking 
of all kinds of reasons for not taking advantage of it. 

It has been determined that there are fewer changes in LC class 
numbers than in DDC, and that Library of Congress cards give
LC class numbers plus LC Cutter numbers on 85 percent of the
cards.⁴³ DDC numbers appear on Library of Congress cards for
about 35 percent of titles for which cards have been printed,
including titles in all languages. However, as indicated by Ben-
jamin Custer, over 95 percent of cards sold are for English lan-
guage titles, and an analysis of these cards received by a sam-
pling of orders in this country indicated that approximately 80
percent contained Dewey numbers.⁴⁴ On this basis, it may be
estimated that, for the type of materials collected by an under-
graduate library in the United States, close to 80 percent of the
Library of Congress cards may be expected to contain numbers 
representing different editions of Dewey, provided that the Li- 
brary of Congress continues to assign Dewey numbers at the 
same level as were assigned at the time that the analysis was 
made. 



Conclusions 

Recent developments in the application of computers to li-
braries, and the planning and establishing of networks of all
kinds (national, regional, state and others) must force classifi-
cation reevaluation. The fact that centralization of biblio-
graphic processing through automation is closer to reality than
at any time in the past century is a strong impetus against con-
tinuing one's provincial ways. Under these conditions it will be-
come imperative that large libraries consider how they may be
assimilated into a national network. Shell has expressed the
opinion that LC can be programmed to do all that we have re-
quired of an enumerative scheme up to the present, so that ef-
fective electronic searching, printouts of lists of materials for
any segment of LC, book catalogs, inventory control, etc., can
all be done with the aid of computers. Future demands for
more sophisticated searches may have to be met by the appli-
cation of a new language which will be used for certain types
of input and information retrieval, but not for the organization
of books on the shelves or in the card catalogs.⁴⁵ Mines has
shown that both Dewey and LC notations may be manipulated
by computers. He encountered no problems in programming,
arranging or finding of items in either DDC or LC.⁴⁶ DDC does
not offer the advantage of a purely numeric notation when the 
complete call number is considered. 
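
As a rough illustration of the kind of machine arrangement described above, the following Python sketch files a few LC-style call numbers. It is not the program referred to in the text; the regular expression, the function name lc_sort_key and the sample call numbers are all assumptions made for the example. The point it makes is that a mixed notation needs a parsing step before it files correctly, since a plain character-by-character sort places QA76 before QA9.

```python
import re

# Assumed shape of a call number: one to three class letters, an optional
# (possibly decimal) number, then whatever follows (Cutter number, date).
_LC_PATTERN = re.compile(r"^([A-Z]{1,3})\s*(\d+(?:\.\d+)?)?\s*(.*)$")


def lc_sort_key(call_number: str) -> tuple[str, float, str]:
    """Split an LC-style call number into (letters, number, remainder).

    Hypothetical helper for illustration only; real shelflisting rules
    are more elaborate than this three-part key.
    """
    match = _LC_PATTERN.match(call_number.strip().upper())
    if match is None:
        raise ValueError(f"not an LC-style call number: {call_number!r}")
    letters, number, rest = match.groups()
    # The class number files numerically, not character by character.
    return (letters, float(number) if number else 0.0, rest)


if __name__ == "__main__":
    shelf = ["QA76 .K5", "Q11 .A3", "QA9 .B2", "PZ3 .D5"]
    print(sorted(shelf))                   # character-by-character order
    print(sorted(shelf, key=lc_sort_key))  # order using the parsed key
```

The same consideration applies to a complete Dewey call number, whose book number is alphanumeric even though the class number is purely numeric.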

Angell has indicated two factors which may influence the LC
classification in the future. First, the transfer of the library's bib- 
liographic records to computer operation will render the shelf 
arrangement less important. It is envisaged that consoles may 
replace conventional catalogs as now used, providing the facil- 
ity for browsing which is presently offered by open stacks. Sec- 
ond, there is the need to economize on space in anticipation of 
future accessions. Should arrangement by size be utilized as a
space saver, the need for classification as a means of shelf ar-
rangement would no longer exist.⁴⁷ Progress towards automation
cannot be expected to be rapid in most libraries, and may not 
be possible in the foreseeable future in others. There will, 
therefore, be a need to have the schedules maintained accord- 
ing to standards of today. 

Matthis and Taylor note that any perfect system is a dead sys- 
tem, and a classification system based on a total view of 
knowledge is preposterously presumptuous. 

Essentially the argument has now moved beyond theoretical 
discussion of the "best" classification system and settled 
upon the real issue— the promise and prospect of centralized 
cataloging and classification. No one classification system 
will ever solve all of the problems, but the practice of "rug-
ged individualism" in cataloging no longer makes sense and
should no longer be tolerated.⁴⁸



The Universal Decimal Classification (UDC) was last presented
in detail to an American public in the first volume of the Rut-
gers series on Systems for the Intellectual Organization of
Information¹ by Jack Mills, who gave a detailed and closely
reasoned overview of the theoretical foundations of the system.
The difficult problems and intricate discussions reported in this
book and the esoteric language used by Jean Perreault in his
collection of theoretical essays² may have estranged the last
remaining adherents of the UDC in this country rather than en-
dearing the system to them. The erroneous impression was
created that the UDC was a system fit only for philosophers and
strange European classificationists but not a practical tool for
information retrieval. This view was reinforced by the popular 
but fallacious notion that classification systems as such were 
outmoded, and in fact dead as doornails, as far as scientific 
and technical information and its retrieval were concerned 
since computers would do the job better and faster if only their 
memories could be made large enough. Added to these artifi- 
cially generated obstacles to the promotion of the UDC as a via- 
ble retrieval tool was the complete lack of a usable English edi- 
tion of the scheme, because no comprehensive schedules have 
been published in English since an abridged edition in 1961 
which is, of course, now completely out-of-date. 



Present Applications 

The scheme, however, is alive and well in most parts of the
world, including the American continent (in particular in the
Latin American countries), with the sole exception of the United
States where it is still looked upon as an oddity rather than a
viable retrieval tool despite valiant efforts of some American in-
formation scientists to demonstrate not only its usefulness but
also its applicability to computerized stores of information. The
pioneers in this field were Malcolm Rigby,³ Robert Freeman⁴
and Pauline Atherton,⁵ who were followed by T. W. Caless and
others;⁶ quite recently, important work has been done in
Canada by M.A. Mercier and his collaborators⁷,⁸ in the con-
struction of a computerized retrieval system for water resources
information in project Environment Canada, known as WAT-
DOC, which is based on the UDC, resulting in a concordance
between the classification schedules and the Water Resources
Thesaurus of the U.S. Department of the Interior. Other
noteworthy practical and large-scale computerized applications
of the UDC have been made in Germany,⁹ the U.K.,¹⁰
Denmark¹¹ and Switzerland; these and other computer ap-
plications were summarized at two international seminars de-
voted to the topic¹³⁻¹⁵ and in a report by Rigby.¹⁶

The growing interest in the construction of both general and
specialized thesauri in the latter half of the 1960s led to
another fallacious idea, namely the superiority of verbal re-
trieval tools which purportedly could keep pace with changing
terminology much better than relatively rigid hierarchical clas-
sification schemes. Two pilot projects were undertaken to test
the validity of this proposition when UDC was compared with
the Engineers' Joint Council's TEST thesaurus¹⁷ and the MeSH
subject heading list used by Index Medicus.¹⁸ Results showed
conclusively that the UDC could handle almost all concepts
listed in the two thesauri, while conversely the thesauri showed
some serious lacunae in their coverage; on the other hand,
these projects also revealed some structural faults in UDC
which, although their nature had been known for a long time,
were put in sharp perspective by these comparative tests. The
overall conclusion drawn from these experiments was that, far
from being competing retrieval tools, classification systems of a
faceted or semi-faceted nature (such as the UDC) complement
verbal retrieval tools of the thesaurus type, and vice versa;
moreover, a sound classification scheme is, in fact, indispensa-
ble for the successful construction of a thesaurus.

Indeed, universal classification schemes are now needed more 
than ever, be it as backup systems in situations where detailed 
information is retrieved by specially devised schemes (verbal or 
classificatory) but where marginal subjects have to be handled 
by a general scheme, or be it as "switching devices" between
two or more different retrieval systems that are used
simultaneously.¹⁹ These aspects as well as many others relating
to the theory and practice of the UDC have recently been
treated in an excellent and exhaustive book by A.C. Foskett,²⁰ to
which the reader is referred, since it would be presumptuous 
for anybody to try and paraphrase the wealth of material 
brought together there in easily readable and thought- 
provoking form. Foskett's book also deals extensively with the 
present shortcomings of the system, both on the conceptual 
and managerial level, and his proposals for improvements in 
both respects will hopefully have a profound influence on any 
future developments concerning the UDC. 



Although the present situation, as indicated above, is not at all 
as gloomy as it is sometimes painted by people who have but 
scanty knowledge of the UDC, it would be foolish to deny that 
the system is in urgent need of revision and reform. Its 
framework, still largely cast in the mold of the Dewey Decimal
Classification (DDC) of which it originally formed an extension,
suffers from overcrowding in classes 5 and 6, and unhelpful
dislocations of closely related subjects, as in the notorious case
of theoretical chemistry (54) being separated from chemical
technology (66), and other such incongruencies and even out-
right follies. The high degree of detail and specialization, once
the hallmark and pride of UDC, now threatens to suffocate the
system: too many additions and "refinements" were often made
by people with little insight into the workings of a general clas-
sification scheme and interested only in promoting their own
special field, so that the UDC is now weighed down by un-
reasonably long and complicated notations and a growing re-
dundancy of concepts listed in many different parts of the
scheme, leading to complexity and ambiguity in application and
to consequent retrieval failures.

Added to this are serious shortcomings in the management of
the system which have been pointed out by Wellisch²¹ and
Foskett²² but have been dealt with so far on a patchwork basis,
if at all. Foskett also made the proposal to transfer the respon-
sibility for the English edition of the UDC to the newly-formed
British Library, which might consider the adoption of the
scheme for the classification of its open-shelf reference
collection.

While both existing and imaginary shortcomings of the UDC
were formerly pointed out mostly by individuals (UDC users and
developers as well as outside critics), the last few years have
seen more concentrated efforts at constructive criticism and re-
vision, backed by institutions and by the UDC's own governing
body, the Central Classification Committee of the International
Federation of Documentation (FID). Some of the more impor-
tant of these proposals for renewal and reform will now be
briefly presented, particularly since some of them are of such
recent date that they are not yet covered by Foskett's book.



Reform or Revolution? 

Two trends are now clearly discernible: one is concerned with
the upkeep of the present framework of the system in more or
less unmodified form, making only routine amendments and ex-
tensions while at the same time using new techniques of pre-
sentation and indexing. The other trend is towards a more revo-
lutionary shake-up of the whole system with the aim of creating
a new universal classification scheme or, as some of the prop-
onents of this school of thought have suggested, a New UDC
or NUDC. On the face of it, it may seem that there is a basic
contradiction here, and that the simultaneous pursuit of such
divergent aims can only lead to a dissipation of already scarce
resources in manpower and money which would be better
spent in concentrating on either one of these trends. Although
such a danger no doubt exists, there is some justification for
proceeding along both lines simultaneously.

The UDC in its present form is still the most widely used system 
of classification for information retrieval, despite its many faults 
and a certain lack of enthusiasm displayed even by its defen-
ders and users. Almost everybody agrees that the system is in
great need of a thorough overhaul, but the design and con-
struction of a completely new scheme is a major undertaking
that must necessarily take many years until it can be presented
to the world, and will then take a few more years to be tried in
actual retrieval situations so as to debug the system for use at
least during the remaining decades of this century and perhaps
beyond. Meanwhile, the existing framework must be kept in a
viable state, (a) by taking into consideration new developments
in all branches of knowledge and (b) by putting at the disposal
of users, present and potential, the tools that make it possible
to utilize the system.

Objective (a) is met in part by the traditional method of
piecemeal revision of existing sections, and although this is at
present a tedious and cumbersome process, some measures
have already been adopted to rid the system of its
hyperdemocratic procedure, which in the past often delayed urgently
needed revisions for months and even years. As a result,
several hundred amendments and innovations in dozens of
important subject fields have been introduced during the past few
years, giving the lie to the often-heard argument that the UDC
cannot cope with new developments in science and technology
or in the social sciences (the latter having just undergone an
almost complete revision and updating which is still not finished
but has already resulted in a much improved class 3). Another
part of the revision procedure is the impending reallocation of
class 4 (emptied more than ten years ago in order to
accommodate new subjects and overcrowded sections of classes 5
and 6, but as yet not reoccupied). Various proposals were
submitted and discussed, and at present it seems that the following
allocation of subjects has the best chance of being approved as a
new class 4 schedule:

4 Man and his natural environment. Material resources. Science
and technology in general.

41 Man as an individual. Medical sciences, anthropology,
psychology.

42 General biology, botany, zoology.

43 Agricultural sciences. Plants and animals.

44 Animal biology and husbandry (if 43 for plants and crops
only).

45 Mineral resources. Mining and mineral dressing.

46 Materials. Testing, sampling, etc.

47 Handling and transport of materials and persons.

48 Management: business, household, etc.

Objective (b) is currently being met by a large number of new
and revised full, medium-sized or abridged editions of the
tables in more than 20 languages (which, incidentally, now form
the largest existing general multilanguage thesaurus of terms,
where each language is linked to any of the others through the
relevant UDC number, although much remains to be done in
order to reconcile vocabularies and sometimes even the
interpretation of certain UDC numbers in different languages,
cultures and political regimes). The most important among these
editions, which will hopefully be forthcoming within the next
couple of years, is a new English-language edition.



English Basic Medium Edition

As mentioned above, the use of the UDC in the English-speaking
world has been seriously hampered by the lack of a
usable English abridged or medium-sized edition. Plans are
now being made to bring out a revised and updated Basic
Medium Edition (BME) in English by the Central Classification
Committee in collaboration with the British Standards
Institution (the body responsible for all English-language editions of
the UDC). The publication of this edition will be the UDC's
contribution to the Melvil Dewey Centenary in 1976. It will serve
two basic purposes: (a) it will put at the disposal of UDC users
the long-awaited comprehensive tables in English which could
be used for most information retrieval work except where very
fine detail of classing is needed (for which almost complete full
tables are now available in English); (b) it will serve as the
master file for the creation of other medium-sized editions in
various languages and will be constantly kept up-to-date in the
editorial offices by mechanized equipment. At present, a
committee is trying to determine the degree of abridgment from the
full tables for every major subject field so as to assure a
balanced presentation in the forthcoming BME, since criticism had
been levelled at the somewhat uneven allocation of detail in
previous medium-sized editions that were published in German
and French.



Index in Thesaurus Form

It is now generally recognized that a well-constructed
thesaurus, using the standard relational devices of USE, BT, NT
and RT, is a more flexible aid to the classifier than the
conventional type of relative alphabetical index, and certainly much
better than the mechanically produced one-line indexes of the
German editions, which are more in the nature of
concordances. A pilot project, undertaken by a group of Belgian
experts, resulted in the construction of a thesaurus-type
alphabetical index to part of class 33, economics. Although this is a
notoriously difficult field for any kind of index because of the
vague and constantly shifting terminology of the discipline, the
results are very encouraging and much superior to the type of
relative index used up to now in the English and French
editions of the UDC.



Subject-Field Reference Code

Turning now from these projects which are still geared to the 
existing framework of the UDC to the more ambitious plans for 
future remodeling of the whole system, at least three ap- 
proaches have been made, and some tentative outlines have al- 
ready been published for discussion. 

The worldwide information and documentation network
inaugurated by Unesco under the name of UNISIST recognized in its
first report the necessity for an internationally applicable
classification system for recorded knowledge by means of a Broad
System of Ordering (BSO) and came to the conclusion that the
UDC would be suitable for this purpose, although it might have
to be substantially changed and updated:

The use of the Universal Decimal Classification in
particular . . . has been advocated. Its further potential has
yet to be realized, and both a continuing programme to 
strengthen UDC and further studies and experiments to test 
its applicability to retrieval systems are desirable.^^ 

From the outset it was clear that BSO would not be as
elaborate as UDC but would rather be a much more general system
serving primarily two functions: (a) as a tool for broad
indication of subject fields and disciplines; (b) as a switching code
for other retrieval systems (classification systems, including the
present UDC, as well as verbal indexing tools, such as subject
heading lists and thesauri) which could thus achieve a minimal
measure of mutual compatibility while still catering to the
specialized needs of experts and practitioners in a particular
subject field. The original name of the project was later
changed to Standard Reference Code (SRC), sometimes also
referred to as a "roof" code (the R in SRC then standing for
"roof").

Initially there were some misgivings on the part of UDC experts
and users that the new SRC, for the development of which FID
had assumed responsibility, would virtually be the end of the
UDC without necessarily resulting in a better or more useful
tool (the properties, scope and actual application of the SRC as
yet being in the realm of speculation), while the proponents of
the SRC were apprehensive lest the more traditional ideas
inherent in UDC would exercise an undue restraining influence
on SRC. This conflict was resolved by the formation of an
independent group of experts in FID who will deal only with the
development of SRC (although some members of the group and
its coordinator, Dr. I. Dahlberg, are also UDC experts active in
the formulation of more or less radical redevelopment 
programs^^ for UDC and conversant with the actual problems of 
document retrieval systems). 

One of the first actions of the group was to give the initials
SRC again a somewhat changed meaning as "Subject-field
Reference Code," and to state that it would serve as



A tool for interconnection of information systems, services 
and centers using diverse (often incompatible) indexing/ 
retrieval languages. 

A tool for tagging (i.e., shallow indexing) of subject fields and
sub-fields.

A referral tool for identification and location of all kinds of
information sources, centres and services.^s 



In early 1974, some 90 top-level subject fields had been
identified, and a more detailed list of a second and third level
breakdown will be elaborated and discussed at meetings during
1974, to be submitted for final approval at the forthcoming
Third International Conference on Classification Research in
Bombay in January 1975. At present, only a very rough tentative
outline exists, so that it is impossible to assess the value of the
system for its stated objectives as compared with other
universal systems (including the UDC, which gave the impetus to the
whole enterprise) and to judge whether the international
scientific community will be persuaded to use it, and if so, for what
purposes, since the system is expressly intended not for the
retrieval of individual documents from any specific store, but only
as a kind of identification system for the location of blocks of
information (whatever that may be) and whole collections.
Whether the considerable effort expended on the construction
of the SRC scheme will be justified by these rather limited and
somewhat nebulous goals remains to be seen.



UDC as a Universal Faceted Classification

An elaborate plan for a thorough reform and revision of the
UDC was submitted by A. F. Schmidt, head of the Classification
Committee at the German Standards Institution, who is
responsible for the German UDC edition, in collaboration with J. H. de
Wijn, who is in charge of the Dutch UDC edition. To those
familiar with the principles of UDC it might come as a surprise
that a reform proposal should in fact only confirm what has
actually been the inherent nature of the UDC since its beginning,
namely its basically faceted structure (devised long before the
term facets had been coined by Ranganathan, who conceived
of the idea after having studied the structural features of the
UDC).

But while it is true that UDC has always displayed facets in the
form of its general auxiliaries and has indicated them by
various nonnumerical symbols serving as facet indicators, this
principle is countermanded innumerable times in the schedules
themselves, where the age-old method of simple decimal
subdivision and enumeration, basically inherited from DDC, is used
where the application of existing facets would not only be more
logical (in terms of the structure of the system) but would also
result in better and simpler retrieval. Allow me to give just two
examples. Anything connected with a country can and should
be expressed by the geography facet (an elaboration of DDC's
geographical subdivision device), e.g., where (73) is U.S.A.,
63(73) is U.S. agriculture, 72(73) is U.S. architecture, etc., but
the geography and history of a country are still main numbers,
viz., 917.3 and 973, exactly as in DDC. This means that in
inverted files where documents can be grouped by the geography
facet to give (73)63, (73)72, etc., the documents on the U.S.A.
are dispersed to at least three different and noncontiguous
places (since the unfortunate interpolation of biography, 92,
between geography and history has also been retained in UDC).
Another example is the classing of persons, for which a quite
detailed and generally applicable auxiliary schedule, -05, exists;
thus we find 52-05, astronomers; 62-05, engineers; 681.11-05,
watchmakers; etc. But for some unexplained reasons, there are
also many direct subdivisions for persons, such as 262.14,
clergymen (359.8, military chaplains, is separate!), and 78.07,
musicians, using a different special auxiliary, .07, to do the job
for which the general auxiliary -05 was devised.
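
The dispersal described above can be made concrete with a minimal
sketch in Python. It assumes a deliberately simplified notation
(a main number, an optional place auxiliary in parentheses, and an
optional -05 persons auxiliary); the pattern and the function names
parse and invert_by_place are illustrative assumptions and do not
cover real UDC syntax.

```python
import re
from collections import defaultdict

# Simplified, hypothetical pattern: main number, optional place
# auxiliary such as "(73)", optional persons auxiliary "-05".
# Real UDC notation is far richer than this.
NOTATION = re.compile(
    r"^(?P<main>\d[\d.]*)(?:\((?P<place>\d+)\))?(?P<persons>-05)?$"
)


def parse(notation):
    """Split a simplified UDC-like number into its facets."""
    match = NOTATION.match(notation)
    if match is None:
        raise ValueError(f"unsupported notation: {notation!r}")
    return {
        "main": match.group("main"),
        "place": match.group("place"),
        "persons": match.group("persons") is not None,
    }


def invert_by_place(notations):
    """Group class numbers by their place facet (None if absent)."""
    index = defaultdict(list)
    for notation in notations:
        index[parse(notation)["place"]].append(notation)
    return dict(index)


shelf = ["63(73)", "72(73)", "917.3", "973", "52-05", "78.07"]
by_place = invert_by_place(shelf)
```

In this sketch 63(73) and 72(73) are filed together under place 73,
while 917.3 and 973 carry no place facet at all and so stay apart
from them, which is the inconsistency the text objects to. Likewise
52-05 is recognized as a persons class, but 78.07 is not.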

The Schmidt-de Wijn proposal is intended to put an end to
these incongruities and to put the UDC on a truly faceted basis
without any exceptions, thus cutting back substantially on the
ever-growing numerical subdivisions which have become an
impenetrable undergrowth stifling the sound trees of the
system. Their plan provides for three levels:

1. Superstructure

2. Direct subdivisions

3. Recurrent subdivisions

The superstructure would consist of about 70 to 80 super- 
classes with a two-figure notation (which might coincide with 
the upper level of SRC), followed by classes with a three-figure 
notation and then four-figure subclasses, all structured by the 
present principle of decimal subdivision. Within any of these 
three levels, further subdivision would be possible by distinctive 
notational devices and the appendage of either recurrent or 
special subdivisions (corresponding roughly to the present 
general and special auxiliaries). Finally, a relatively large but 
still manageable number of recurrent subdivisions would be
developed, e.g., concepts and features that cut across all
disciplines and are more or less applicable to all or most of them.
Again, this principle is not basically new, but would now be
applied much more consistently, freeing the main numbers from
all unnecessary ballast and harmful duplication (as in the
example of civil and military clergymen, neither of which should
be enumerated at all but only indicated by a suitable person
facet in the religion and military administration schedules
respectively).

The following general facets or recurrent subdivisions are
proposed:

General features (e.g., abstract concepts: time, relation, size,
quantity, quality, criterion, experience, etc.)

Processes

Actions

Methods

Energy and power

Objects (materials, persons as individuals and as groups,
products, documents)

Languages

Philosophies

Cultures

Cosmic and geographic units

Schmidt also proposes some changes in the use of symbols as
signposts and relational indicators, and considers that a UDC
restructured along these lines would be even more amenable to
computerized information retrieval than the existing form
which, as already pointed out, has proved to be
computer-compatible when relatively small adjustments were made, or
even where the existing tables were used without any change.
The authors of this proposal hope that their plan would provide
not only the "roof" for SRC but also the "pillars" to bear it, i.e.,
the necessary substructure of more detailed indication of
document content needed in actual retrieval situations, which
no doubt are of more importance to individual researchers than
an internationally standardized code for blocks of information.



NUDC

A similar approach to the restructuring of UDC, a New UDC or
NUDC, is taken by the Czech classificationists D. Simandl and
L. Kofnovec,27 who also take the need for an SRC as their
starting point but proceed to develop a methodology for the
construction of a revised UDC rather than proposing a new
structure as such. Their approach is based on the relative
importance of subjects as indicated by the volume of literature
generated in various fields (based on an analysis of abstracts in the
Soviet abstracting journal Referativnij Žurnal and in other
abstracting and indexing services); it results in the following
rough percentage breakdown of the fields of knowledge:



Technology                                      35%
    Chemical engineering                        10%
    Electrical and mechanical engineering       10%
Natural sciences                                30%
    Chemistry and physics                       10%
    Earth sciences                               5%
Medicine and agriculture                        15%
Social sciences and humanities                  15%
Others                                           5%



The authors also provide a more elaborate table in which the 80
or so superclasses of a possible SRC are assigned two-figure
notations and which shows a suggested breakdown of these
main subject fields which is substantially different both from
the one to which we have become accustomed in the
10-main-class framework common to DDC and UDC and from
the one suggested by the SRC committee. This means that the
biggest difficulty in the design of a new scheme seems to be
the lack of consensus among various groups of experts on
what constitutes a "superclass" or main discipline. It remains
to be seen whether the Simandl-Kofnovec methodology can be
fruitfully amalgamated with the Schmidt-de Wijn structural
approach, and whether the final product of these endeavors will
lead to a universal retrieval system which can cope better with
rapid changes and innovations through inherent reliance on
basic building blocks rather than on ad hoc additions and
amendments to an inherently rigid and outmoded framework.



Conclusions 

This necessarily brief survey of the most recent developments
in the complete revision or reform of the UDC shows clearly
that there is still quite some life in the old tree whose roots go
back more than one hundred years if we trace its ancestry to
Dewey's scheme, first conceived in 1873 and published in 1876.
It is also evident that the UDC is still a truly international
scheme, with people in many countries contributing to its
further development. These are by no means the isolated efforts
of starry-eyed idealists, but constructive attempts made by
experts and backed by national and international organizations.

Now that the initial euphoria of the early 1960s concerning the
use of computers in information retrieval has evaporated, and
the subsequent infatuation with slightly spruced-up subject
heading lists under the grandiose name of thesauri has been
replaced by more sober assessments of the requirements for
construction and utilization of these and other retrieval tools
(all of which rely in the last analysis on classificatory
principles), there is indeed room for a new appraisal of existing
classification systems and their restructuring in the light of
both theoretical and practical insights gained over the last
quarter of a century. The UDC, so often declared to be dead
(especially by those who did not know the purportedly
deceased in life), will probably have to play an important role in
these future developments towards a truly universal and
international retrieval code, even though the phoenix to arise out of
the ashes may have little outward resemblance to its venerable
predecessors.



Automatic Classification:
Directions of Recent Research 



Zandra Moberg 



As the pervasive computer technology has come more and
more to dominate many aspects of library science, directions of
change in librarianship have been largely determined by this
technology. "Library science" has been broadened to "library
and information science," symbolizing the unlikely union of
computer scientists and electrical engineers, on one hand, and
humanistic librarians on the other, on common professional
ground.

In classification this synthesis has been manifested in extensive
investigations into computerized classification. At first these
new systems, alluding as they do to clumps and passes and
thesauri and algorithms, seem to bear little resemblance to the
subdivision-of-knowledge approach encountered in philosophy
and in traditional library science classification. It is gratifying
for librarians to note, nevertheless, that W. C. Berwick Sayers'
classic definition, which specifies only that the arrangement of
items be useful, pinpoints the raison d'être of all automatic
classification attempts, and is, even with the further refinement
that things be assembled in an order of likeness,^
comprehensive enough to include them all.



Justification of Research 

Impetus for research and development in automatic
classification has grown from very practical considerations. Little
research is undertaken in any field from purely theoretical
interest; someone must be willing to pay for it, and this implies
that the results should have potential value in application.
Automatic classification is needed for at least two important
reasons. The first reason is utility to the user. Precision and
recall from automatic indexing alone leave a great deal to be
desired, and classification can be used to expand the queries
and thereby to retrieve more relevant documents.
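
As a minimal sketch of what expanding a query through a
classification can mean, the Python fragment below adds to a query
every term that shares a class with one of the original query terms.
The function name expand_query and the sample classes are
hypothetical and are not taken from any system discussed here.

```python
def expand_query(query_terms, term_classes):
    """Return the query plus all terms sharing a class with it.

    term_classes is an iterable of sets of terms. Classes are
    matched against the original query only, so expansion does
    not cascade from one class into another.
    """
    query = set(query_terms)
    expanded = set(query)
    for members in term_classes:
        if query & members:
            expanded |= members
    return expanded


# Hypothetical keyword classes, for illustration only.
classes = [
    {"aircraft", "airplane", "plane"},
    {"engine", "motor"},
]
expanded = expand_query({"airplane", "wing"}, classes)
```

A document indexed only under "aircraft" would now match the
expanded query, which is the sense in which classification can raise
recall.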

An even more compelling justification for automatic document
classification in obtaining financial backing for research has
been economics: searching only one section of a classified
document collection, especially a very large one, requires 
dramatically less computer time as well as human time. The 
goal of research should be to produce effective, practicable 
classification by computer rather than manually, given the 
geometrically increasing volume of information to be proces- 
sed. 



Automatic Classification Arrangements 

It should be emphasized that "automatic classification" may
refer to two distinct kinds of arrangements: it may be a scheme
for the grouping of index terms or it may mean classifying the
documents themselves. Investigators in the sixties worked on
systems of one kind or the other. The names of Cleverdon,
Salton, Lesk, Needham, and Sparck Jones are cited repeatedly for
work on the former; significant contributions to research on the
latter were made by Borko, Doyle, Rocchio and Dattola. A
current trend seems to be to coordinate the two kinds of
classification into one system.

Another basic dichotomy in classification, which has become a
dominating issue posed by the use of computers, is that of
semantic classes versus statistical, mathematical classes.
Computers manipulate and store symbols quantitatively, but classes
need to have appropriate names for communication of meaning
between people.^ Acceptance of this premise seemed to mean
that classes must have semantic unity, which pointed the way
to experiments in linguistic analysis.



Semantic Classification 

Language analysis by computer, with a view to assignment of
linguistic symbols to documents as content and class
identifiers, has been one of the projects of Gerard Salton's SMART
system. SMART, the most sophisticated and elaborate
information storage and retrieval system in the United States, was
designed at Harvard between 1961 and 1964. Operating at Harvard
and Cornell, SMART has been supported by the National
Science Foundation. Under the aegis of computer scientist
Salton, SMART is an experimental automatic system which serves
as a testing ground for many ideas in information storage and
retrieval, language analysis being conspicuous among them
in the sixties.

The language analysis experiments on SMART employ manual
intervention in the processing of documents in that the thesauri
are constructed manually. This means that judgments on which
terms are to be classed together are made for SMART by
people, not by machines, and that, strictly speaking, this is not
automatic classification. However, it is not always logical or
practical to separate indexing from classification. Salton's findings
with the manual thesauri and Salton's thinking have influenced
directions pursued by subsequent researchers, and theoretical 
approaches outlined by him will be seen to have been carried 
out by others working on automatic classification. 

The thesaurus entries in the SMART system constitute the
classes of a keyword classification which was constructed by
subject experts and committees of subject experts. The
synonym thesaurus, or dictionary, builders determined which
words and phrases would be important content identifiers for a
given subject area; they were required to come up with all
possible words or word combinations, so that the computer could
be programmed to recognize them. There were separate
dictionaries for different fields. There was also a suffix dictionary,
listing approximately 200 English suffixes, a statistical phrase
dictionary, a syntactic phrase dictionary, the usual negative
thesaurus, word stem dictionaries, and concept hierarchies
constructed specifically for different fields. These dictionaries
were used in extensive experiments in semantic analysis
employing several hundred automatic content analysis methods.^

To give an idea of the complexity of some of the procedures, in
one, the syntactic phrase dictionary, each syntactic phrase
entry consists of a specification of component concepts,
syntactic indications, and syntactic relations permitted between
concepts, all indicated by numbers. There are four possible
basic kinds of syntactic indications, each being divided into
twenty syntactic types. Syntactic dependency types are
expressed in the form of syntactic dependency "trees," vertical
displacement along a given path of the tree denoting syntactic
dependency. Parts of speech are also prescribed.^ Despite the
enormous amount of work that must have gone into this
thesaurus and associated algorithms, syntactic phrase analysis
is not mentioned in the detailed evaluation studies and may be
presumed to have been abandoned.

The dictionaries entailed great expenditure personally, as well
as monetarily, it would seem:

... the task of constructing a subject dictionary ... is one
which demands many skills, including a great deal of
persistence and tenacity ... a committee is often appointed to
thrash out controversial questions which frequently ends by
satisfying no one ... any saving which might result from
automatic search and retrieval methodology might be promptly
lost through the elaborate preparations required to build
dictionaries.^



One can infer the Sturm und Drang on the thesaurus
committees. Salton and Lesk concluded that automatic or
semiautomatic dictionary construction is imperative, above all, "to eliminate
the human element"!^ Furthermore, in the exhaustive retrieval
evaluation studies the performance of language analysis to
characterize documents using these thesauri is not as effective
as expected,^ with the exception of the regular synonym
dictionary. The numerous dictionaries, the hundreds of methods
and countless runs on SMART for language analysis are
expensive. Although the experiments are designed to meet the first
practical need stated above (utility to the user by provision of
more effective retrieval tools), Salton recognizes that ultimately
cost is a consideration of overriding importance.'^

Salton's current thinking on language analysis is that linguistics
does not have much of a role to play in information retrieval,
having established once and for all, it would seem, that
linguistic analysis by computer to characterize documents is not a
fruitful avenue of investigation.

If automatic syntactic and semantic analysis has proven a dead
end for classification, it is probably due to the nature of
language itself and not because of inadequacies in anybody's
algorithm. Salton and Lesk state it rather strongly: "... no
human intermediaries exist who could resolve some of the
ambiguities inherent in the natural language itself, or some of the
inconsistencies introduced into written texts by the authors."^
They leave it unclear whether or not they meant at that time to
suggest that they thought machines ought to be able to resolve
those ambiguities better than the subjective human
intermediaries. It is an accepted premise of linguistics that the set of
words in a given language is infinite and that the possible
combinations of symbols are similarly infinite,^^ which presents
a formidable challenge even to the most sophisticated
software-hardware combinations.



Mathematical Keyword Classification 

Salton and Lesk have turned from dictionary construction to the
consideration of automatic methods of mathematical keyword
classification. Their term-document matrix association procedure
is based on the work of Doyle and others; they suggest that a term
association process be applied to the matrix to achieve thesaurus
groups. In 1966, Salton and Lesk stated prophetically that "it
would be nice if it were possible to give some generally applicable
algorithm for constructing hierarchical subject arrangements,"^
and went on to outline possible automatic and semiautomatic
methods of hierarchy formation based on keyword
co-occurrence and resulting in a classification tree, a technique
later developed and applied in the system of the Moore School at 
the University of Pennsylvania. *- 

While Salton's experiments assumed that membership in a
common class means that words must be related semantically, Karen
Sparck Jones and others at the Computer Laboratory at the
University of Cambridge have bypassed the foregoing difficulties
with language by creating a purely statistical keyword
classification system. "We cannot ask direct questions about the meanings
of words if we are using automatic techniques," states Sparck
Jones flatly in her monograph on automatic classification, but in
a mathematical keyword classification it does not matter, for if
two words always co-occur in a given set of documents they are
necessarily able to be substituted for one another in retrieval, and
it makes no difference whether they are semantically or
conceptually related or not.^*
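
The substitution argument can be illustrated with a small Python
sketch, offered under stated assumptions rather than as Sparck
Jones's procedure: documents are represented as sets of keywords,
and two keywords are treated as interchangeable when they occur in
exactly the same documents. The names postings and always_cooccur
and the sample data are hypothetical.

```python
from collections import defaultdict


def postings(docs):
    """Map each keyword to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            index[term].add(doc_id)
    return index


def always_cooccur(docs):
    """Group keywords whose document sets are identical.

    Such keywords retrieve exactly the same documents, so either
    can stand in for the other whatever their meanings.
    """
    groups = defaultdict(set)
    for term, doc_ids in postings(docs).items():
        groups[frozenset(doc_ids)].add(term)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical three-document collection.
docs = {
    1: {"boundary", "layer", "flow"},
    2: {"boundary", "layer"},
    3: {"flow", "wing"},
}
substitutable = always_cooccur(docs)
```

Here "boundary" and "layer" form the only group, because both occur
in documents 1 and 2 and nowhere else; no judgment about their
meaning was needed to find it.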

Acknowledging previous work on statistical association in 
classes by Doyle, Cleverdon, Borko, Needham, Salton and others,
Sparck Jones has devised a series of controlled experiments to
test the effectiveness of selected automatic keyword classifica- 
tion methods. Funded by the British government, the project has 
attempted to describe logically systematic comparisons of the 
recall and precision measures achieved by these methods. 
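
For readers unfamiliar with the two measures, the following Python
sketch computes them in their standard form: precision is the share
of retrieved documents that are relevant, and recall is the share of
relevant documents that were retrieved. The function name and the
sample document numbers are illustrative only.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    Both arguments are collections of document identifiers.
    An empty denominator yields 0.0 rather than an error.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Four documents retrieved, three relevant, two in common.
precision, recall = precision_recall({1, 2, 3, 4}, {2, 4, 5})
```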

Any mathematical classification using a matrix is a two-stage
process, involving, first, the construction of a similarity matrix for
object pairs based on co-occurrence of terms in documents. The
class-finding procedure is then applied to this matrix to identify
groups of similar terms. Four different routines for constructing
the matrix and four group-finding procedures have been
compared by Sparck Jones: strings, stars, cliques and clumps, so
named for the kinds of links between elements of the classes.
Runs were made using different combinations of each of these
variables, comparing different values of a given parameter in a
base environment with a view to drawing conclusions about the
nature of a "good" classification.^®
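
The two stages can be sketched in Python as follows. This is an
illustration under assumptions, not a reconstruction of the routines
actually compared: the similarity definition used here is the
Tanimoto coefficient (shared documents divided by documents
containing either term), and the class-finding step is one simple
reading of a "star", namely a term together with every term linked
to it at or above a threshold. All names and data are hypothetical.

```python
from itertools import combinations


def similarity_matrix(docs):
    """Stage 1: pairwise term similarity from co-occurrence.

    Returns a dict keyed by alphabetically ordered term pairs.
    Pairs that never co-occur are omitted (similarity zero).
    """
    index = {}
    for doc_id, terms in docs.items():
        for term in terms:
            index.setdefault(term, set()).add(doc_id)
    sim = {}
    for a, b in combinations(sorted(index), 2):
        shared = len(index[a] & index[b])
        if shared:
            sim[(a, b)] = shared / len(index[a] | index[b])
    return sim


def stars(sim, threshold):
    """Stage 2: one class per term, made of that term plus
    every term linked to it with similarity >= threshold."""
    linked = {}
    for (a, b), value in sim.items():
        if value >= threshold:
            linked.setdefault(a, set()).add(b)
            linked.setdefault(b, set()).add(a)
    return {centre: {centre} | others for centre, others in linked.items()}


# Hypothetical three-document collection.
docs = {
    1: {"wing", "lift", "drag"},
    2: {"wing", "lift"},
    3: {"drag", "engine"},
}
sim = similarity_matrix(docs)
loose = stars(sim, 0.5)
tight = stars(sim, 0.9)
```

With the threshold at 0.5 the sketch links "drag" with "engine" as
well as "lift" with "wing"; raising it to 0.9 keeps only the
"lift"/"wing" class, which shows how a stricter link requirement
narrows the classes.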

Sparck Jones has found, first, that automatically obtained term
classification does give better retrieval than unclassified terms
alone, which means that automatic keyword classifications are
worth constructing and that the means of constructing them on
a large scale is readily available by computer. Her results also
show that the choice of similarity definition on the matrix does
not affect retrieval performance very much and that the choice
of class-finding procedure also does not matter very much:
strings, stars, cliques and clumps all did about the same. One
constant did become apparent. Whichever of the four class
definitions was used, restricting the class to a very strongly
connected set of elements gave noticeably better retrieval
performance.^^

Another striking finding was that restricting the vocabulary of 
the classes by excluding the very frequent terms, treating them 
as classes unto themselves, promoted higher retrieval perfor- 
mance. And Sparck Jones found, contrary to expectation, that 
higher recall was accompanied by higher precision values,
attributable to the fact that the classes were not mutually
exclusive, and the context of one class or another defined more
sharply the homographic words.^^ Her thorough evaluation
included external comparisons in which she found her best
results to be roughly comparable to Salton's with his best manual
regular thesaurus.^^

Discussing the necessity of updating a collection to take into
account new documents, since addition of each new one could
create new patterns of co-occurrences of terms, Sparck Jones
has acknowledged the pervasive problem of control of very
large collections. The last sentence of her monograph
points out the direction being taken by current investigations:
the use of automatic keyword classification in conjunction with
automatic partitioning of a collection into units (document
classification) for searching.^©

In later research on term classification, Sparck Jones found the
villain in low retrieval performance sometimes to be the supply
of terms themselves, between which strong term connections
could not be formed because relevant documents had not been
separated from nonrelevant ones.^' This provides strong
support for the notion that automatic keyword classification will
work best if the documents are classified or grouped by
likeness and that the two kinds of automatic classification are
indeed complementary.^^



Clustering Techniques 

Extensive work was done on SMART on automatic mathemati- 
cal document classification in the late sixties, namely the clus- 
tering techniques worked out by Rocchio and Dattola with a 
view to shortening search time.22 Groups, or clusters, were 
created by correlating documents on a matrix according to 
keyword co-occurrence. One representative document
description, called the centroid vector, was generated to represent
all the documents in a given cluster, being a ranked list of the
most frequently occurring index terms in the cluster. Queries
were matched initially against the centroid vectors of the group
and then against selected pertinent documents within the 
group. Standard precision-recall evaluations were carried out 
and investigations on questions pertaining to optimum cluster 
size, amount of overlap between groups, query clustering, and 
how many documents to allow unclustered were performed. 
The SMART investigators were able to conclude that cluster 
searching appears to offer large savings in search time, at no 
substantial loss in recall and precision, for all searches not re- 
quiring either a very high recall performance or a very high 
precision." Generally, more clusters with fewer documents in
each gave better precision and recall.

A new wrinkle in current investigations into automatic
classification is automatic hierarchy construction. The document
clustering methods are potentially hierarchical, and Salton
suggested at one time a multilevel search procedure by grouping
the centroid vectors themselves into broader and broader
groups for expanding the search.^^ This proposed multilevel
procedure is based on a principle to be worked out in others'
algorithms, that the breadth or generality of a subject class is a
function of how widely and how frequently member index terms
occur. Salton's idea was depicted as wider and wider circles in
a document space to indicate the levels.^^
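
The centroid search described in this section can be illustrated with a
minimal Python sketch. It is not the SMART implementation: the cosine
measure, the `top_n` cut-off for the centroid, the `best_clusters`
parameter and the sample data are all assumptions made for the
example.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine correlation between two term-weight dictionaries."""
    dot = sum(w * v.get(t, 0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def centroid(doc_vectors, top_n=5):
    """Ranked list of the most frequently occurring terms in a cluster."""
    totals = Counter()
    for vec in doc_vectors:
        totals.update(vec)  # adds the term weights together
    return dict(totals.most_common(top_n))


def cluster_search(query, clusters, best_clusters=1):
    """Match the query against centroids first, then only against the
    documents inside the best-matching clusters."""
    ranked = sorted(
        clusters,
        key=lambda name: cosine(query, centroid(list(clusters[name].values()))),
        reverse=True,
    )
    hits = []
    for name in ranked[:best_clusters]:
        for doc_id, vec in clusters[name].items():
            hits.append((cosine(query, vec), doc_id))
    return sorted(hits, key=lambda h: (-h[0], h[1]))


# Hypothetical collection already divided into two clusters.
clusters = {
    "A": {"d1": {"cluster": 2, "centroid": 1},
          "d2": {"cluster": 1, "search": 1}},
    "B": {"d3": {"thesaurus": 2, "keyword": 1},
          "d4": {"keyword": 1, "index": 1}},
}
query = {"cluster": 1, "search": 1}
results = cluster_search(query, clusters)
```

Only the documents of cluster A are compared with the query here, which
is the source of the saving in search time. A multilevel version would
group the centroids themselves and repeat the same two-step match one
level higher.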



Text Organizing System 

Production of a total system for processing data bases incor- 
porating advanced techniques in information storage and re- 
trieval into one practicable package has been the goal of the 
Text Organizing System at the Moore School, University of 
Pennsylvania, which was put together during the past ten years 
for the U.S. Office of Naval Research by a group including 
Prywes, Lefkowitz, Litofsky, Kdymon and others. The work is 
still in process. In contrast to the testing ground function of 
SMART or Sparck Jones's program package of an orderly 
series of controlled tests with concurrent evaluation studies, 
the Text Organizing System is the product of applied research, 
intended for use with specialized or private data bases.^^ 

The unifying component of this system is a classification
algorithm, called CLASFY, contributed by Litofsky,^* which
successively subdivides items to create a hierarchy or classification
tree, based on occurrence of index terms assigned to the items,
until document groups or keyword groups of the desired size
are obtained. In the first step of the process candidate index
terms are extracted from the text automatically. These are then
selected manually for eventual use in the classification
algorithm, assisted by printouts from the computer of word
frequencies and similarly spelled words which are used as guides
for reducing the term vocabulary. The resulting directory of
index terms determines which terms shall represent the
documents in which they are contained. The classification algorithm
is then applied to the documents represented by the index
terms, successively subdividing the collection hierarchically, to
produce a reordered data base, arranged in an order reflecting
similarity of groups. Documents adjudged to be similar are
allocated to common "cells" of approximately equal size. Moving
up the tree, index terms showing content of document groups
beneath them are indicated at the nodes of the tree, the
terminal nodes being the cells, the actual location sites of the
documents.29

The classes at these nodes on the tree are denoted by num- 
bers, and document descriptions are built by combining, in 
order, the numbers of the successive nodes under which the 
document is grouped to produce the document's canonical 
classification number.^^ As in Salton and Lesk's manually con- 
structed concept hierarchy tree, each successively higher node 
is assigned one more digit, the number farthest to the left rep- 
resenting the most general (i.e., frequent) class, with numbers 
representing the more specific (less frequent) classes toward 
the right. The class, or node, numbering system is thus the 
numerical notation of a synthetic classification scheme, the no- 
tation being built up from a hierarchy of node numbers repre- 
senting mathematical classes derived from frequency statistics 
of index terms. In Salton and Lesk's hierarchy, it will be re- 
membered, the classification numbers symbolize words and 
concepts; in the Text Organizing System, the synthesized clas- 
sification number classifies a document. 

CLASFY is a three-step algorithm which partitions the collec- 
tion into more groups each time it is applied, each level in the 
tree resulting from another application of the subdivision pro- 
cess. In the first pass just the keywords of the collection or 
subgroup are partitioned; the next two passes assign docu- 
ments on the basis of matching similar keywords in them to 
one of the groups. The algorithm continues to be reapplied 
until groups of like documents reach specified optimum size.^^ 
The more unique (to the collection) terms a document contains, 
the farther down the tree it will be. 
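
A rough Python sketch may help to show how repeated subdivision yields
both the tree and the classification numbers. It is emphatically not
Litofsky's CLASFY: the splitting rule used here (divide on the most
frequent keyword that does not occur in every document of the group) is
an assumed stand-in, and the names `classify` and `cell_size` and the
sample collection are hypothetical.

```python
from collections import Counter


def classify(docs, cell_size=2, prefix=""):
    """Recursively split docs (doc_id -> set of keywords) into cells.

    Returns doc_id -> classification number; one digit is appended for
    each level of subdivision, so the leftmost digit marks the first,
    most general split.
    """
    if len(docs) <= cell_size:
        return {doc_id: prefix or "1" for doc_id in docs}
    counts = Counter(k for keys in docs.values() for k in keys)
    # Keywords present in every document cannot divide the group.
    candidates = [(n, k) for k, n in counts.items() if n < len(docs)]
    if not candidates:
        return {doc_id: prefix or "1" for doc_id in docs}
    # Most frequent keyword first; ties are broken alphabetically.
    _, split_key = min(candidates, key=lambda c: (-c[0], c[1]))
    with_key = {d: k for d, k in docs.items() if split_key in k}
    without = {d: k for d, k in docs.items() if split_key not in k}
    numbers = classify(with_key, cell_size, prefix + "1")
    numbers.update(classify(without, cell_size, prefix + "2"))
    return numbers


# Hypothetical five-document collection.
docs = {
    "d1": {"radar", "antenna"},
    "d2": {"radar", "signal"},
    "d3": {"radar", "antenna", "noise"},
    "d4": {"sonar", "signal"},
    "d5": {"sonar", "noise"},
}
numbers = classify(docs)
```

In this toy run the first split is on "radar" and the second on
"antenna", so documents d1 and d3 share the cell numbered 11 while d4
and d5 remain together in cell 2.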

The tree describing the entire collection is made available to
the user in two directories: the key-to-node directory, which lists
index terms along with all the document canonical classification
numbers to which they have been assigned, and the node-to-key
directory, which lists all the node numbers with assigned
keywords. Facsimile tables of these directories are manually
searched in printout or microfilm form, a first step in retrieval.^^

The final step of the integrated text processing in T.O.S. is the
creation of a keyword classification by means of the same
subdivision algorithm. The keyword vocabulary is successively
subdivided into mutually exclusive sets of keys on the basis of
classification numbers assigned to each key in the document
classification process: the more documents a term appears in,
the higher up in the tree it will be. The product of the
mathematical keyword classification is an Affinity Dictionary,
which is intended to be used both to expand a search by
identifying interchangeable terms, and to consolidate terms in
updating the system.33

Like Sparck Jones's thesaurus, the Affinity Dictionary is an au- 
tomatically derived keyword classification, the difference being 
in the mathematical techniques. Any comparison of retrieval 
success is not possible because the keyword classification 
component of the Text Organizing System can not be isolated 
out from the total system, and because no recall-precision
values are available on it as yet. It is of interest to note that the
divisive, hierarchical T.O.S. algorithm achieves one condition
noted above which Sparck Jones's tests demonstrated to be an
important factor for success: that frequent terms should be
classes unto themselves and that less frequent terms should be
grouped. This is exactly what CLASFY does with keywords.

In the absence of retrieval success figures, which are still being
worked on at the Moore School,34 the authors use a rather
curious criterion for evaluating their document classification
scheme: "... the quality of a classification system is measured
by how well it minimizes the average number of keys, per
cell,"35 that is, the fewer keywords characterizing a class of
documents, the more alike they must be.
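
The criterion is easy to compute once the cells are known. The sketch
below assumes that "keys per cell" means the number of distinct
keywords found among the documents of a cell; that reading, the
function name and the sample cells are assumptions made for
illustration.

```python
def average_keys_per_cell(cells):
    """cells: list of cells; each cell is a list of keyword sets,
    one set per document. Lower values mean more alike documents."""
    distinct = [len(set().union(*cell)) for cell in cells if cell]
    return sum(distinct) / len(distinct) if distinct else 0.0


# The same four hypothetical documents, grouped two different ways.
tight = [
    [{"radar", "antenna"}, {"radar", "antenna", "noise"}],
    [{"sonar", "signal"}, {"sonar", "noise"}],
]
loose = [
    [{"radar", "antenna"}, {"sonar", "signal"}],
    [{"radar", "antenna", "noise"}, {"sonar", "noise"}],
]
tight_score = average_keys_per_cell(tight)
loose_score = average_keys_per_cell(loose)
```

The grouping that keeps the radar documents together needs fewer
keywords per cell and so scores better on this measure.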



Hoyle's Integrated System

Meanwhile, W. G. Hoyle at the National Research Council of
Canada has recently developed an integrated indexing and
classification system similar in many respects to the Text
Organizing System. The automatic indexing procedure is the basis
for automatic generation of a classification scheme; documents
are assigned to categories on the basis of keyword occurrence,
as in CLASFY. Hoyle's procedure does not involve
reapplications of the divisive algorithm to create a hierarchy, but he does
suggest the possibility of "super categories" of keywords by
this method. A specialized thesaurus, analogous to the Affinity
Dictionary, is also a last step of the process. Hoyle's method,
however, does not partition the collection into mutually
exclusive groups, but produces rather an "ordering of relevance."
Further, the ordering employs weighting of terms, which the
Text Organizing System does not. Comparing his document and
keyword classification system to a manual one using the same
material, Hoyle found "reasonable resemblance."^®



Van Rijsbergen's Hierarchical Clustering

Another avenue to reducing computer time in retrieval by scan- 
ning only a subset of a classified collection is hierarchical clus- 
tering, called cluster-based retrieval. Research on this approach 
has been reported recently by van Rijsbergen at Cambridge, 
with "helpful comments and criticism" by Karen Sparck Jones. 
Document classification follows logically from Sparck Jones's
findings on the importance of the collection properties; this
method arranges the collection in a hierarchic system of
clusters. The clusters are obtained by a single-link cluster method
applied to a dissimilarity matrix to generate a stratified
hierarchy of clusters.^'
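
As an illustration of the single-link principle only, the following
Python sketch builds the clusters at several dissimilarity levels. Two
documents fall in the same cluster at a given level when a chain of
links, each no greater than that level, connects them. The function
name, the choice of levels and the sample dissimilarities are
assumptions and are not drawn from van Rijsbergen's programs.

```python
def single_link_levels(items, dissim, levels):
    """Return {level: clusters} for a dissimilarity table.

    dissim maps a pair (a, b) to its dissimilarity; clusters at each
    level are the connected groups of links with value <= level.
    """
    hierarchy = {}
    for level in sorted(levels):
        parent = {item: item for item in items}

        def find(x):
            # Follow parent pointers to the representative of x's group.
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        for (a, b), d in dissim.items():
            if d <= level:
                parent[find(a)] = find(b)
        groups = {}
        for item in items:
            groups.setdefault(find(item), []).append(item)
        hierarchy[level] = sorted(sorted(g) for g in groups.values())
    return hierarchy


# Hypothetical dissimilarities among four documents.
items = ["d1", "d2", "d3", "d4"]
dissim = {
    ("d1", "d2"): 0.1, ("d2", "d3"): 0.4, ("d1", "d3"): 0.6,
    ("d3", "d4"): 0.8, ("d1", "d4"): 0.9, ("d2", "d4"): 0.9,
}
hierarchy = single_link_levels(items, dissim, [0.2, 0.5, 0.8])
```

Because every cluster at a lower level is contained in some cluster at
each higher level, the levels taken together form the stratified
hierarchy.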

In the Text Organizing System, it will be remembered, the 
documents are located only at the tips of the branches of the 
tree, and Salton's suggestion for multilevel search with clusters
created hierarchy by using the centroid vectors or keyword
groups. Hierarchic clustering is innovative for arranging the
documents themselves hierarchically. Evaluated for retrieval
effectiveness, this technique was reported cautiously by van
Rijsbergen as being "quite competitive."^'



The Mathematical Theory of Hierarchy 

It may be observed that several of the automatic classifications
described have included or hinted at hierarchy formation.
These mathematically obtained hierarchies are formed on a
different basis from hierarchies heretofore encountered in library
science. Hierarchy has had a common meaning of a division of
knowledge into progressively narrower classes which
necessarily bear a generic-specific relationship to one another. The new
automatic hierarchies are based on the occurrence and
co-occurrence of words in documents. Words occurring less
frequently, in fewer documents, determine a lower position in the
hierarchies; more frequent terms appearing in a wider range of
documents are indicators of a higher point. Thus "generic" is
replaced by "more frequent" and "specific" comes to mean "less
frequent." Salton propounded the new mathematical theory of
hierarchy in 1966:

. . . there seems to be some relationship between the fre- 
quency of occurrence in a given collection and its place in 
the hierarchy. More specifically, those concepts which
exhibit the highest frequency of occurrence in a given
document collection, and which by this very fact appear to be
reasonably common, should be placed on a higher level than
those concepts whose frequency of occurrence is lower.^^
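
The principle can be made concrete with a small Python sketch that
places terms on levels according to the share of documents in which
they occur. The cut-off values and the sample collection are arbitrary
assumptions chosen for the example; the source gives no particular
thresholds.

```python
from collections import Counter


def frequency_levels(docs, cutoffs=(0.8, 0.4)):
    """Assign each term a level: 0 is the top of the hierarchy.

    A term drops one level for every cut-off that its document
    frequency share falls below.
    """
    df = Counter(term for doc in docs for term in set(doc))
    total = len(docs)
    levels = {}
    for term, count in df.items():
        share = count / total
        levels[term] = sum(1 for c in cutoffs if share < c)
    return levels


# Hypothetical four-document collection.
docs = [
    ["science", "physics", "optics"],
    ["science", "physics"],
    ["science", "chemistry"],
    ["science", "physics", "lasers"],
]
levels = frequency_levels(docs)
```

Here "science", found in every document, sits at the top; "physics"
falls one level below; and the terms occurring once fall to the lowest
level, without any generic-specific judgment having been made.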

It is again gratifying for librarians to note that the profession
has not rested on theory at variance with a reality being
transformed by automation. At the Elsinore Conference on
Classification Research in 1964, classification was defined as "any
method creating relations, generic or otherwise, between
individual semantic units, regardless of the degree of hierarchy
contained in the systems and whether . . . with traditional or
more or less mechanized document searching,"^^ a detailed but
flexible definition which will accommodate any ilk of hierarchic
or nonhierarchic classification based on word frequency and
distribution figures.

Conclusions 

In all of the above-reported research, in which methods have 
been sought to generalize classification from indexing, there 
has been implicit recognition of the inextricable relatedness of 
indexing and classification as two facets of one process. No au- 
tomatic system should be expected to retrieve satisfactorily 
from a system of document description in which the classifica- 
tion bears no relation to the index terms as is sometimes the 
case in the Library of Congress system. This emphasis on
coordination of the two is a less obvious theoretical
contribution of automatic classification research.

Research will continue to be carried out and supported if for no
other reason than the economic one of reducing computer time
on large collections. Richmond's prognosis for automatic
classification in the most recent Annual Review of Information
Science and Technology is dim because "its utility as a
satisfactory means of document representation has not been
demonstrated except for small, homogeneous collections in
well-defined subjects of narrow scope"^' (probably an allusion to
Sparck Jones's findings). This statement can apply to keyword
classifications only; document classification aims at the
creation of just such "small homogeneous collections in
well-defined subjects" within the larger collections. And within such
homogeneous document subgroups, retrieval success with
keyword classification can be expected to be optimal. What
seems to be called for at this time in management of large
collections is coordination of keyword classification and document
grouping, which is what the Text Organizing System of the
Moore School and others have been attempting to do. And,
since the ultimate basis for existence of information systems is,
to go back to Sayers, utility to the user, interest will not be
expected to wane in the refinement of systems to approach the
elusive goal of high precision and recall.

About one hundred years ago, the British philosopher William
Jevons dismissed classification as a "logical absurdity." His
view has not been an unusual one. Some of the more
interesting adventures in information science, such as Mortimer
Taube's Uniterms, stemmed from an irremediable loss of faith
in classification as a way of organizing knowledge.
Furthermore, Kurt Goedel's now famous proof pulled the rug out from
under systems based solely on deductive logic, so that one was
left with the necessity of organizing knowledge on some other
basis.



In effecting change to a new basis, two areas were obviously
wide open for investigation. The first was words (index terms,
descriptors), the path taken by Taube and others. The second
area was inductive logic: classification systems built from the
ground up primarily. The most notable of these have been the
faceted classification systems, but they are not the only ones.
Maps, graphs, patterns, statistics and probability theory have
been invoked either to show relationships or to find them.

The area of words has had considerable attention for some
time. The great weakness of systems based on words turned
out to be the very richness of language. Part of the problem
was the willingness of creative minds to employ old words with
new meanings, a practice Robert Fairthorne deplored as words
"used in public with private meanings."^ The effort to escape
classification by means of verbiage alone was not an
unqualified success, nor was it a total failure. The successful
development of thesauri during the past fifteen years has shown
that a controlled vocabulary can be quite useful in a
homogeneous subject field, even though it has minimal
classification features, "minimal" meaning to about seven levels.^
Subject headings, descriptors, index terms, unit terms and
many other kinds of terms are still very much with us.

Methods of linguistics, particularly computational linguistics, 
are still being investigated as a means of describing 
knowledge.^ Machine translation is currently in abeyance 
though it may not remain so. Some methodologies originally 
developed for syntactic analysis, content analysis and similar 
processes are viable, as is Gerard Salton's ultrarefined little 
SMART system at Cornell.* 

The problems of definition of words remain. Relationships be- 
tween concepts for which words stand and the very sources 
and uses of words themselves still need much more research. 
Interest in relationships has led from word lists to thesauri to 
attempts at mapping by means of directed graphs and other 
similar devices.* The mapping has reintroduced a factor of
classification into the subject analysis process, just as "see also"
cross-references inserted a measure of classification into
subject heading compilations. A less semantically-oriented type of
relationship study has produced little classification systems by
means of some of the techniques of applied mathematics.
William Goffman's "indirect method" is an example of this.*
Although it has been used deliberately for word classification in
only one publication so far,^ it has some interesting
possibilities.* In sum, one may say that in the area of words, as
an alternative to classification, the trend has led right back to 
classification. 



The area of inductive logic was partly the foundation of faceted 
classification, but here again it was not all clear sailing. The 
problem of relationships turned up again, and Jason Farradane 
has been emphasizing this factor for over twenty years.^ Derek 
Austin, who was engaged to work full time on a New General
Classification for the Classification Research Group in London,
found the relationship factor so highly significant that, while he
did not complete the New General Classification, he made use
of the relationship-in-classification idea to the extent of
developing it in his PRECIS system.^^ PRECIS terms, among other
things, carry enough of their context with them to be a
miniature, multiple-entry classified index to a given title.



Continuation of Present Work

The idea of a New General Classification has not been given up
in London. Regular meetings are still being held and the
problems of classification still retain their interest.^^ Classification is
by no means a dead issue.

For the immediate future, one may expect continuation of pres- 
ent work. This includes the Depth Classification schedules 
being produced at the Documentation Research and Training 
Centre in Bangalore, an ongoing effort that already has pro- 
duced well over fifteen schedules on subjects mostly, but not 
entirely, scientific and technical. Statistical and probabilistic 
methodology will certainly continue to be applied in the attempt 
to make automatic classification systems. Karen Sparck Jones, 
in particular, has been persevering in this direction.^^ Modern 
mathematics has areas, notably in topology, which may be ap- 
plicable to classification. Some attempts have been made 
along the line of physical models for three-dimensional
classification in an attempt to improve the visualizing process begun
in mapping class relationships via directed graphs and the
like.^^ At the moment, this work is primarily a teaching tool.

Looking toward the future, UNISIST, with the cooperation of the
International Federation for Documentation, has set up a
Working Group to prepare the background for and to begin to
develop a Subject-field Reference Code (SRC). This code is
designed to produce a Broad System of Ordering, which is "a
mechanism for shallow indexing, whose goal is to locate and 
transfer large blocks of information, rather than specific docu- 
ments or data, between different discipline and mission- 
oriented systems, using, eventually, different natural 
languages."^^ This broad classification scheme is to be univer- 
sal in scope, flexible enough to keep up with changes in the 
fields of science and technology, easily updated, simple in 
structure so that it can be adopted and adapted inexpensively, 
and usable for both manual and computerized systems. These 
noble goals are not entirely new, but it will be very interesting
to see what transpires. One has the impression of deja vu, but
hope does spring eternal and within the given parameters it
may prove possible to produce such a scheme.

Meanwhile the traditional forms of classification will continue:
expansions of the Universal Decimal Classification, phoenix
schedules in Dewey and new editions of the Library of
Congress system. Various suggestions have been made to get more
mileage out of these various schemes by using a variety of
techniques, such as merging the subject headings, class
descriptive terms and index terms for the Library of Congress and
Dewey systems, with rotation and permutation of the individual
words involved, in order to offer the user more access points.'^
John Immroth has made a study of the Library of Congress sys- 
tem in this respect and has developed a means for chain- 
indexing some of the classification.^^ His method appears to 
work better with those parts, notably in literature, which have 
some degree of hierarchy in them. 
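
The rotation of words mentioned above, which multiplies the access
points to a single heading, can be sketched briefly in Python. The stop
list, the " / " marker separating the rotated tail from its head, and
the sample heading are assumptions for illustration and do not
reproduce any published procedure.

```python
def rotated_entries(heading, stopwords=frozenset({"of", "and", "the", "in", "for"})):
    """Produce one index entry per significant word of a heading,
    each entry beginning with that word."""
    words = heading.split()
    entries = []
    for i, word in enumerate(words):
        if word.lower() in stopwords:
            continue  # minor words do not become access points
        entry = " ".join(words[i:])
        if i:
            entry += " / " + " ".join(words[:i])
        entries.append(entry)
    return sorted(entries)


entries = rotated_entries("Classification of Plants")
```

A user looking under either "Classification" or "Plants" will find the
heading, where the unrotated form offers only the first.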

Influence of New Factors on the Scene

In PRECIS, in automatic keyword classification, in the SMART
system and others, the computer has been introduced as a
convenient tool. This gadget promises to become as
commonplace as the telephone. Currently, there is a hand-held
calculating machine which can be programmed with cassettes, thus
giving the user the option of staying in his office and getting as
much or more computer power than was available with early
machines of the first generation. The computer has already
been applied to classification in a project covering the whole
middle edition of the Universal Decimal Classification (in
English). The computer can be used as a sorting device with
the notation of almost any classification system. It is used for
printing and maintaining the Library of Congress subject
heading system. It is the mainstay of all automatic classification
attempts and a good many indexing ones as well.

For comparative classification studies, it is helpful because it
can dredge up more materials for study in an hour than one
could get in a year by manual means, assuming
machine-readable text, and one can work with total data instead of
samples. The intellectual aspects of comparative classification,
however, are not amenable to mere computer manipulation
because of the varying theoretical foundations of the more
common classification schemes.

The computer can do a lot more than just count. Even a brief
look at the usages to which it has been put in the humanities
(other than for concordance-making) will indicate the variety of
methods that have been used to deal with masses of historical
data, literary text, archaeological findings, architectural
rendering possibilities, and so on. The cleverness with which the
computer has been assimilated into the methodology of the
research scholar is impressive. One can, for instance, use a
quantitative history approach to reveal unsuspected occurrences for
which the answer must then be sought by more conventional
means.19 As machine-readable data bases, such as MARC,
become available and access to such bases via console spreads,
it will be possible to replace one classification with another
automatically, though the results may not be very satisfactory
because of the variation in systems and also because a
one-to-one correspondence between classes is present less than half
the time.

It is much more likely that a totally new classification might be 
adopted if a satisfactory one comes along. This is probably one 
thing that keeps the Classification Research Group going. The
Bliss Bibliographic Classification and the Colon Classification 
were hardly touched because they arrived on the scene when 
the three major systems were already entrenched. In big li- 
braries, total reclassification, as a rule, has only been done 
under dire necessity, as, for example, at Cornell in 1947 when 
the old homemade system had virtually collapsed. 

With centralized processing and computerized distribution, the 
adoption of a new scheme, while still a major problem, would 
not be an unthinkable one. Therefore, classification research 
and the search for a new general system are no longer the
knowledge-for-knowledge's-sake undertakings that they have
appeared to be for the last fifty years.

In addition to the computer, there is another major factor on 
the scene which eventually may affect classification. This is the 
concept of the wired city. The combination of cable television, 
the telephone and the computer can bring into any subscriber's
home a variety of services. Some of the projected
configurations sound like Big Brother, especially as there will be
changes in who controls access to information, not to mention
who has access to information and when. On the other hand,
the conveniences of the wired city would be great time and
energy savers. Classification might well be involved in the
information retrieval aspects of such a system. One can conceive
of browsing in a classified catalog when one dialed up the local
library. When specific information was sought, data could be
displayed on the TV screen and the reference interview could
be conducted over the line, so to speak, instead of over the
counter.

A third new factor on the scene is the concept of the
information utility. "A utility can be defined as a system providing a
relatively undifferentiated but tangible service to a mass
consumer group and with use charges in accordance with a pricing
structure designed for load levelling."20 Normally one would not
consider the library in this category. However, a step in that
direction has been taken with the formation of networks among
libraries in order to share the cost of delivering bibliographic
material to cooperating institutions. The middleman rather than
the user pays the cost. Information centers and commercial
services which serve the paying customer are much closer, but
they are still independent agents, not a utility. If the whole lot
were thrown in together, including the agencies that produced 
the information-product, and the user paid according to his 
needs, one would have something closer to a utility. 

Perhaps the librarians nearest this concept are those who 
argue that it would make sense to contract out for all or some 
of the data services based on machine-readable data bases 
rather than to try to maintain them in the library where they 
would only be partially used at great overhead cost. A resident 
bibliographer would act as agent between the would-be user 
and the vendor to ensure that the best possible connections 
were made.^^ The user would ultimately pay the selected ven- 
dor. Veaner has suggested, for example, that cataloging ser- 
vices might be "purchased totally from a vendor or obtained 
from his resident staff, much as computer centers buy 
specialized expertise through the 'resident s.e.' (systems 
engineer)."22 In view of how much cataloging still has to be
done locally at present, this seems less likely than the purchase
of access to on-line reference tools. In either case, however, the
technical process of actually calling up the data is going to
require classification and indexing, particularly if the query is on
a subject: "I need to know how to synthesize rubrene." "What
is available on the etiology of multiple sclerosis?" "I want to
consider the degree to which modern poets may have been
influenced by current scientific views on plate tectonics. Please
give me all the review articles you can find on this theory." If
the distribution of information is to follow the pattern of the
distribution of electricity or gas, the price will have to be lower
than at present and the quality of the end product higher. The
idea of an information utility will be an interesting thing to
watch. The analogy may not be as applicable as it sounds.



Unsolved Problems 

There are still at least four major unsolved problems in classifi- 
cation. The first is the problem of continuous updating. Every 
classification system is out-of-date the minute it goes to the 
press. If it goes out at noon, by one o'clock additions and 
changes have already begun to come in. Existing classification
systems are kept up-to-date by corrections and additions made
either at regular intervals, as with the Library of Congress
system, at irregular intervals by international cooperation, as with
the Universal Decimal Classification, or mainly by editions, as
with the Dewey Decimal and Colon Classifications. The
continuous process at regular intervals is probably the most
satisfactory for the user, but even here it is easy to fall behind
current knowledge.

The second problem is virtually unsolved. How does one rep- 
resent objective reality adequately? Any system delineated on 
the pages of a book leaves a great deal to be desired. Subjects 
may be scattered through a multiplicity of disciplines. Hierar- 
chies with only two dimensions are unrealistic. Connections or 
splits or mergers among subjects may be hard to show. The 
trend toward interdepartmental cooperation tends to wipe out 
specific boundaries in some places and raise them in others. 
Complex systems, for example, exist everywhere, but the study 
of them excludes most of those in the humanities. 

The third problem in classification is that we do not yet have an 
organizing philosophic basis for current thought in the late 
twentieth century. The philosophy may be here but unrecog- 
nized, or it may be in process but has not yet emerged publicly. 
As yet we cannot reorganize our body of knowledge according 
to principles more realistic for its content. This seems like a 
minor matter, but it is not. Each age has its own way of looking 
at the universe and its own body of knowledge and belief fol- 
lows that insight. No new major classification has come forth in 
such terms for the last half of the century. Perhaps part of the 
difficulty the Classification Research Group has had in produc- 
ing their projected New General Classification is that they have 
no philosophic system to hang it on. Thus they are generalizing 
particulars and dealing with methodologies when what they re- 
ally need is a broad organizational pattern from a suitable 
philosophic system. 

The final major problem is that of how to develop a completely 
open-ended system with infinite hospitality in array, chain and 
concept capture. In one way, this is a part of the first problem, 
that of continuous updating, but it is also a technological diffi- 
culty, a philosophic one and a notational problem. The solution 
must be partly a creative one. It seems more likely to come in a 
flash of creativity or even by chance. But since chance favors 
the prepared mind, one must still explore all possibilities while 
awaiting insight. 



A Few Research Problems 

Some of the possibilities to prepare the ground for advances in 
classification will be listed here. They seem relatively mundane 
compared with the problems given above. In the process of 
their undertaking, they should suggest other research, which in 
turn will lead to more, until ultimately we will have a better 
basis for classification-making than exists today. The topics are 
as follows: 

1 Frequency distribution study of classification number/subject 
heading correlation with words of titles in nonliterary works. Is 
title page classification the rule or the exception? 

2 Frequency distribution study of the coincidence of 
classification number and first subject heading. Is this 
common practice or wishful thinking? One should get a 
Zipf-Bradford curve if the former is actually the case. 

3 Study of built-in ambiguity in a classification system. What 
differences can be demonstrated in classifiers' interpretations 
of a given classification scheme? Can variations be eliminated 
so that a user can count on getting everything specifically on 
the same topic in the same class? 

4 Study of variations among systems of classification. Instead of 
critical analysis with a view to abstracting "the best," what 
could be taken from each for augmenting the record, thus 
allowing users a variety of entrance points? (This is in part 
taking the opposite approach from that in problem number 3.) 

5 Study of using variant classifications for different subjects or 
materials in the same collection. Is one grand scheme for all 
as effective as an eclectic system where the principles for 
classifying each subject would be suited to that subject or to 
the kind of material? One can think of good reasons for 
classifying government documents, serials, literary works, 
certain audiovisual materials and possibly scientific literature 
differently from the rest of a library's collection. 

6 Frequency distribution study of the effect of classification unity 
or scatter caused by cataloging series as separates vs. analyzed 
sets. Can regularities be discovered which would suggest 
practical means of deciding between the two methods when an 
item is first received? 

7 Investigation of depth classification at the chapter and section 
heading level in monographs. Under what circumstances is the 
effort worth the results? How do these additions increase 
accessibility? Should the subject index be added? 

8 Study of cut-off levels in classification. Can one produce 
subject bibliographies evaluated critically so that the user can 
employ classification to get the level of sophistication he 
requires? (This is only partly a classification problem. It is also 
a problem of Dr. Koh's "data quality control." 23) 

9 Cross-classification of data bases in machine-readable form. 
What linkage should be devised between machine-readable 
data bases so that the user could progress from information 
retrieval to bibliographic retrieval to document retrieval from a 
single console? (This does not mean sticking a classification 
number on every word!) 

10 Classification from machine-readable text. Can a method 
similar to content analysis be used in conjunction with 
classification principles to derive an automatic classification of 
any text? This assumes some kind of semantic relationships 
will be discovered and identified during the content analysis 
process, as opposed to the mathematical kind of relationships 
sought by Sparck Jones. 

11 Application of a mathematical means, such as that of Goffman, 
to the terminology of class descriptions as a means of finding 
relationships among classes. Can such a means be used to pull 
together classes scattered among different subjects? 
12 Study of the confusion factors in automatic classification: 
metaphor, allusion, synonymy, analogy. Is automatic 
classification possible where these confusion factors are used to 
express new ideas? Since humans take in new knowledge by 
fitting it into existing patterns and can classify by an "inductive 
leap," 24 can a means be devised to do this automatically? 



Conclusion 

Currently it looks as if classification has taken a new lease on 
life. Twenty years ago a colleague called it "a grand intellectual 
exercise" with the implication that it did not have much value 
beyond that. Now, with extension of the range of immediate ac- 
cess to information, tremendous increase in the sheer bulk of 
information to be communicated, recognition of the inter- 
dependence of subject matter in a great many disciplines, 
technological capabilities beyond the wildest dream of twenty 
years ago and emphasis on quick and effective communication, 
classification is becoming more and more the entry point of 
choice. A part of this is due to the demonstrated weakness of 
reliance on terminology alone. Nevertheless, it is rather obvious 
that classification without indexing is just as impossible as in- 
dexing without classification. The two go hand-in-hand. In con- 
clusion, one may iterate that past is prologue. New oppor- 
tunities call for tradition-shattering creativity, with a promise of 
at least the possibility of widespread usage of results. 